Data Explorer
The Explorer lets you perform various statistical analyses and data mining operations in an easy and intuitive way. As the name implies, the software is meant for exploring data and getting a quick sense of the orders of magnitude of the observed objects. That is why it focuses on graphical representations and mouse-driven operations, unlike traditional statistical tools cluttered with numerous dialog boxes and lists of figures with five decimal places. You can, however, get the detailed numbers once your analysis is complete.
Screenshots: Overview, Contingency table, Weather data.
The Explorer is written in JavaScript and built with Electron.
macOS: download the latest version for darwin from the release page.
Windows: download the latest version corresponding to your system (32-bit or 64-bit) from the release page. The application is bundled into a single exe file, thanks to BoxedApp Packer.
Linux: download Electron for Linux, download the source of the Explorer from the release page, copy the app folder into electron/resources, then run Electron.
At launch time, the Explorer shows a window to choose the dataset to use. You can either drag and drop a file from your computer desktop, or click the clipboard button.
Various file formats are accepted:
Source | File extension | Remarks |
---|---|---|
Access | mdb , accdb | Access 2000 or higher |
ARFF / KEEL | * | No comments at the beginning of the file. The first line must be @relation |
BigQuery | * | A config file with content like this:<br>`BigQuery`<br>`client_secret:/full/path/to/my_private_key.json`<br>`query:select * from lookerdata:cdc.project_tycho_reports limit 1000`<br>`timeout:60000` |
dBase | dbf | |
Excel | xlsx | The names of the fields are expected at the top of the columns |
JMP | jmp | |
JSON file | * | A JSON array of records |
LIMDEP / NLOGIT | lpj | |
MINITAB | mtw | |
MLwiN | ws | Uncompressed format only |
MongoDB | * | A config file with content like this:<br>`mongodb`<br>`host:192.168.0.121:27017`<br>`database:geo`<br>`collection:countries`<br>`query:{cont:{$eq:"EU"},pop:{$gt:50000000}}` |
MySQL | * | A config file with content like this:<br>`mysql`<br>`host:192.168.0.2`<br>`user:bob`<br>`password:secret`<br>`database:test`<br>`query:select * from mytable` |
Postgres | * | A config file with content like this:<br>`postgres`<br>`host:192.168.0.2`<br>`user:bob`<br>`password:secret`<br>`database:test`<br>`query:select * from mytable`<br>or:<br>`postgres`<br>`connection:bob:secret@192.168.0.2/test`<br>`query:select * from mytable` |
R | rdb | Binary format only |
SAS | sas7bdat | Uncompressed format only |
SPLUS | sdd | |
SPSS | sav | Uncompressed format only |
SQL Server | * | A config file with content like this:<br>`mssql`<br>`host:192.168.0.121`<br>`username:bob`<br>`password:secret`<br>`query:select * from mytable` |
Stata | dta | Stata 8 or higher |
Tabular file | * | The names of the fields are expected on the first line |
Bzip2 file | bz2 | The uncompressed file must be in one of the previous formats |
Gzip file | gz | The uncompressed file must be in one of the previous formats |
Web file | * | Contains the url of the data. The remote file must be in one of the previous formats |
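To make the config-file rows of the table more concrete, here is a minimal sketch of how such a file could be read. It is not the Explorer's own parser: the function name `parseSourceConfig` is hypothetical, and the layout (source type on the first line, then one `key:value` pair per line) is an assumption inferred from the examples in the table.

```javascript
// Hypothetical helper, not part of the Explorer.
// Assumption: the first non-empty line names the source type and every
// following line is a `key:value` pair. Only the first colon separates the
// key from the value, so values such as `192.168.0.121:27017` stay intact.
function parseSourceConfig(text) {
  const lines = text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  if (lines.length === 0) {
    throw new Error("Empty config file");
  }
  const [type, ...rest] = lines;
  const options = {};
  for (const line of rest) {
    const idx = line.indexOf(":");
    if (idx === -1) {
      throw new Error(`Expected key:value, got "${line}"`);
    }
    options[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return { type, options };
}

// Usage with the MySQL example from the table.
const exampleConfig = [
  "mysql",
  "host:192.168.0.2",
  "user:bob",
  "password:secret",
  "database:test",
  "query:select * from mytable",
].join("\n");

console.log(parseSourceConfig(exampleConfig));
```

Splitting on the first colon only is a deliberate choice in this sketch, because hosts, connection strings, and queries can themselves contain colons.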
If you click the clipboard button, the data must be in tabular form, with the names of the fields on the first line.
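As an illustration of the expected tabular form, the sketch below turns such text into an array of records. The function name `parseTabular` is hypothetical, and the tab delimiter is an assumption (it is what spreadsheets usually put on the clipboard); the Explorer may accept other separators.

```javascript
// Hypothetical helper, not part of the Explorer.
// Assumption: fields are separated by tabs and the first line holds the names.
function parseTabular(text, delimiter = "\t") {
  const rows = text
    .split(/\r?\n/)
    .filter((line) => line.length > 0)
    .map((line) => line.split(delimiter));
  const [fields, ...data] = rows;
  // Build one record per data line; missing cells become empty strings.
  return data.map((cells) =>
    Object.fromEntries(fields.map((name, i) => [name, cells[i] ?? ""]))
  );
}

const clipboardText = "ID\tCOLOR\n1\tBlue\n2\tRed\n";
console.log(parseTabular(clipboardText));
```

All values are kept as strings here; deciding which fields are numerical is a separate step.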
Once the data have been successfully loaded, the main window is displayed:
Here are the elements of the interface:
List of the categorical fields (aka "the pink zone"). By default only 10 fields are displayed. To resize the list, move the mouse just below the list and drag to shrink or extend the list. To scroll the list, move the mouse to the right of the list.
Icons of the existing analyses (graphs). To run a new analysis, just drag its icon to the workspace.
List of the numerical fields (aka "the blue zone"). By default only 10 fields are displayed. To resize the list, move the mouse just below the list and drag to shrink or extend the list. To scroll the list, move the mouse to the right of the list.
Icons of the tools
Status bar. This area shows, at any time, details about the object under the mouse or the action you are about to perform.
Dock. This area is used to keep graphs that are temporarily removed from the workspace.
Version number
Memory usage
Workspace. This area is where the graphs are created and arranged.
To create a new graph, drag its icon to the workspace. Alternatively, if you don't know which icon to use, you can right-click or control-click on the workspace to get a menu with all the possible analyses.
A graph is represented by an area with several distinct parts:
Close box. Click on this box to close the graph. All the computations done will be lost.
Option menu. Some graphs have different ways of representing the results. In that case click on this sign to bring up the menu to choose from. Alternatively, right-click or control-click within the graph.
Title bar. This area shows the current selection (see below). Click on this area to drag the graph around.
Slots. These are the places where you define the parameters of the analysis. Depending on the graph, different combinations of slots are shown. You can drag a categorical field onto a pink slot and a numerical field onto a blue slot. Parameters can be swapped by dragging from one slot to another (of the same graph and of the same color).
Resize box. Click on this box and drag to resize the graph.
To change the type of a graph, drag the icon of the new type onto the graph. The new analysis will retain the parameters and selection of the previous one.
Every analysis can be restricted to a part of the data. The set of observations (records) currently processed by a graph is called the selection, and is displayed in the title bar. Initially, the selection consists of all the observations, and the title is blank.
Conversely, the selection of an existing graph can be changed by dragging a pie slice onto its title. This lets you run the same analysis successively on different parts of the data.
Dragging a slice to the title of a graph which already has a selection will combine the two sets.
If the two variables are the same, the resulting selection will be the union of the two sets. Example: a pie graph splits the data into Apples, Pears, Peaches, and Apricots. If you drag the apple slice to the title of another graph, the selection will be Apples. If you then drag the peach slice to the title of the graph, the selection will be Apples + Peaches.
If the two variables are not the same, the resulting selection will be the intersection of the two sets. Example: a pie graph splits the data into Apples, Pears, Peaches, and Apricots. If you drag the apple slice to the title of another graph, the selection will be Apples. If you change the variable defining the pie to split the data into Organic and Non-Organic, and drag the Organic slice to the title of the second graph, the selection will be Apples AND Organic.
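The union and intersection rules above can be sketched in a few lines. This is only a model of the behavior described, not the Explorer's code: the names `addSlice` and `matches`, the field names `FRUIT` and `FARMING`, and the representation of a selection as a map from variable to accepted categories are all assumptions made for the illustration.

```javascript
// Hypothetical model: a selection maps each variable name to the set of
// categories accepted for that variable.
function addSlice(selection, variable, category) {
  const next = new Map(selection);
  // Same variable: the new category joins the existing set (union).
  const accepted = new Set(next.get(variable) ?? []);
  accepted.add(category);
  next.set(variable, accepted);
  return next;
}

// A record is selected when it satisfies every variable of the selection,
// so different variables combine as an intersection (AND).
function matches(selection, record) {
  for (const [variable, accepted] of selection) {
    if (!accepted.has(record[variable])) return false;
  }
  return true;
}

const fruitRecords = [
  { FRUIT: "Apples", FARMING: "Organic" },
  { FRUIT: "Apples", FARMING: "Non-Organic" },
  { FRUIT: "Peaches", FARMING: "Organic" },
  { FRUIT: "Pears", FARMING: "Organic" },
];

let fruitSelection = new Map();
fruitSelection = addSlice(fruitSelection, "FRUIT", "Apples");
fruitSelection = addSlice(fruitSelection, "FRUIT", "Peaches"); // Apples + Peaches
fruitSelection = addSlice(fruitSelection, "FARMING", "Organic"); // AND Organic

console.log(fruitRecords.filter((r) => matches(fruitSelection, r)));
```

An empty selection matches every record, which corresponds to the initial state where the title is blank.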
When loading the data, the Explorer identifies fields containing only numbers as numerical, and all other fields as categorical. Sometimes it is desirable to change this. Several possibilities exist.
Drag a numerical field to the pink zone. The field is converted to categorical; the values stay the same but are stored as character strings.
Drag a categorical field to the blue zone. Each category yields a dummy variable of the same name. There are therefore as many dummies as categories in the initial field, and the dummies are mutually exclusive. Example: COLOR is the categorical field being converted:
Original data:
ID | COLOR |
---|---|
1 | Blue |
2 | Red |
3 | Green |
4 | Red |
Data after the conversion:
ID | Blue | Red | Green |
---|---|---|---|
1 | 1 | 0 | 0 |
2 | 0 | 1 | 0 |
3 | 0 | 0 | 1 |
4 | 0 | 1 | 0 |
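The conversion shown in these two tables can be reproduced with a short function. This is a sketch of the transformation only, under the assumption that dummies are created in order of first appearance of each category; `toDummies` is a hypothetical name, not an Explorer API.

```javascript
// Hypothetical helper: replace a categorical field by one 0/1 dummy variable
// per category, named after the category.
function toDummies(records, field) {
  // Distinct categories, in order of first appearance.
  const categories = [...new Set(records.map((r) => r[field]))];
  return records.map((record) => {
    // Drop the original field and keep the remaining ones.
    const { [field]: value, ...rest } = record;
    for (const category of categories) {
      rest[category] = value === category ? 1 : 0;
    }
    return rest;
  });
}

const colorData = [
  { ID: 1, COLOR: "Blue" },
  { ID: 2, COLOR: "Red" },
  { ID: 3, COLOR: "Green" },
  { ID: 4, COLOR: "Red" },
];

console.log(toDummies(colorData, "COLOR"));
```

Because each record has exactly one category, exactly one dummy is 1 on every row, which is what makes the dummies exclusive.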
Original data (before the pivot):
ID | COLOR | HEIGHT | WIDTH | DEPTH |
---|---|---|---|---|
1 | Blue | 142 | 25 | 11 |
2 | Red | 175 | 12 | 16 |
3 | Green | 109 | 48 | 14 |
Data after the pivot:
ID | COLOR | PIVOT | COUNT |
---|---|---|---|
1 | Blue | HEIGHT | 142 |
1 | Blue | WIDTH | 25 |
1 | Blue | DEPTH | 11 |
2 | Red | HEIGHT | 175 |
2 | Red | WIDTH | 12 |
2 | Red | DEPTH | 16 |
3 | Green | HEIGHT | 109 |
3 | Green | WIDTH | 48 |
3 | Green | DEPTH | 14 |
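The pivot illustrated by these tables is a wide-to-long reshaping, which can be sketched as follows. The function name `pivot` and its `keyFields` parameter are hypothetical, and the sketch assumes that every field not listed as a key is pivoted; the `PIVOT` and `COUNT` column names are taken from the table above.

```javascript
// Hypothetical helper: turn each non-key field of a record into its own row,
// with the field name in PIVOT and its value in COUNT.
function pivot(records, keyFields) {
  const out = [];
  for (const record of records) {
    // Fields repeated on every output row (here ID and COLOR).
    const keys = Object.fromEntries(keyFields.map((f) => [f, record[f]]));
    for (const [field, value] of Object.entries(record)) {
      if (keyFields.includes(field)) continue;
      out.push({ ...keys, PIVOT: field, COUNT: value });
    }
  }
  return out;
}

const wideData = [
  { ID: 1, COLOR: "Blue", HEIGHT: 142, WIDTH: 25, DEPTH: 11 },
  { ID: 2, COLOR: "Red", HEIGHT: 175, WIDTH: 12, DEPTH: 16 },
  { ID: 3, COLOR: "Green", HEIGHT: 109, WIDTH: 48, DEPTH: 14 },
];

console.log(pivot(wideData, ["ID", "COLOR"]));
```

With three records and three pivoted fields, the result has nine rows, matching the table.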