Creating your input files¶
Overview¶
You will first need to supply the data that your website will use. In addition, you will need to create a parameter file (in Excel or a text editor) that specifies how the webpage will look. Here, we will talk about what format your data files should be and how to create your parameter file.
This step can take the longest (> 10 minutes, depending on your website design).
Tutorial/Quickstart¶
For getting a tutorial website running quickly:
- You can download an example dataset and parameter file.
- After double-clicking to extract the file, you will find a tutorial_files folder.
- The tutorial_files folder contains a data file (ds.tutorial.data.csv) and a complete parameter file (ds.tutorial.parameters.xlsx).
- To simply test dataspectra, you can use these and skip to Step 2.
- For the uninitiated, I would recommend opening these files and taking a look to see what they look like.
For building your own website:
- I would recommend starting off with getting a tutorial up and running.
- Then, take a look at the template file (ds.parameter.template.xlsx) included with the tutorial material.
- You can fill in the parameters and delete unnecessary parts as needed.
Tutorial File Contents:
- tutorial_files
- The input directory where the example files are located.
- ds.tutorial.data.csv
- The example tutorial dataset file. The primary search term is in the first column. The rest of the columns are different samples. The rows represent different genes.
- ds.tutorial.metadata.csv
- A file that holds the metadata for the data file.
- ds.tutorial.parameters.xlsx
- The example tutorial parameter file. Information about this file is described below.
- ds.parameter.template.xlsx
- The file has the template for a parameter file, with no information filled in.
Organization of Input Files¶
All of the input files, including the data file and the parameter file, should be in one folder. We will refer to this as the “input directory”.
- Input directory
- The folder or directory where all of your input files are kept.
Data Files Overview¶
The data files are the large tables that hold the data that you would like to plot. Your primary search terms should run down one column (rather than along a row). This means that your data should usually be taller than it is wide. The data files can be in .csv (comma-separated) or .txt (tab-delimited) format.
- Primary search terms
- The terms that your user will search for in your website.
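As a rough illustration of the layout described above, the following Python sketch reads a small table and pulls the primary search terms out of the first column. It is not part of dataspectra: the helper names (delimiter_for, read_table), the file name, and the miniature data are hypothetical and exist only for this example.

```python
import csv
import io


def delimiter_for(filename):
    """Pick the column delimiter from the file suffix (.csv or .txt)."""
    lower = filename.lower()
    if lower.endswith(".csv"):
        return ","
    if lower.endswith(".txt"):
        return "\t"
    raise ValueError(f"unsupported data file suffix: {filename}")


def read_table(handle, filename):
    """Read every row of a delimited table from an open text handle."""
    return list(csv.reader(handle, delimiter=delimiter_for(filename)))


# Hypothetical miniature data file: search terms run down the first column,
# samples run across the remaining columns.
sample = io.StringIO("gene,sample1,sample2\nARX,1.5,2.0\nGFAP,10.2,11.9\n")
rows = read_table(sample, "example.data.csv")

# Skip the header row and collect the first column.
search_terms = [row[0] for row in rows[1:]]
print(search_terms)
```

Opening your own file this way before building the site is a quick check that the search terms really are in a single column and that the delimiter matches the file suffix.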
Parameter File Overview¶
The parameter file specifies how your webpage looks and how your data file is formatted.
There are 5 types of parameters to specify:
- Figure
- Side Bar
- Site
- Panel
- Dataset
Each type, except for the “Site” and “Side Bar”, can be used more than once. For example, you may have multiple datasets or multiple figures. Take a look at the tutorial file - ds.tutorial.parameters.xlsx - to see what it looks like.
A template parameter file is also included (ds.parameter.template.xlsx) to guide you in creating your own parameter file.
A few notes:
The columns should be separated by Excel boxes, commas, or tabs depending on the type of file you’re saving it as.
Make sure to save the file with the appropriate suffix (.csv for comma-separated columns, .txt for tab-separated columns, and .xlsx for Excel files).
If you are going to save a comma-separated file, make sure to not include any commas within a particular text column.
Unless otherwise specified, each variable in the parameter file should be written on its own line in the following format (shown here for a csv file):
variablename,The name of your variable
variablekey,The variable key that you want to use
When specifying a new parameter type (e.g. a new figure, dataset, or panel), you will need to specify the type in the first line of the parameter.
parameter, parameterType
- Specify the type of parameter. Options are { dataset, figure, panel, site, sidebar }.
parameter,dataset
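To make the block structure concrete, here is a minimal Python sketch that splits a comma-separated parameter file into blocks, each opened by a parameter row. This is an illustration only, not the parser that dataspectra uses; the function name split_parameter_blocks and the dictionary layout are assumptions made for the example.

```python
import csv
import io

# The five parameter types listed in this section.
VALID_TYPES = {"dataset", "figure", "panel", "site", "sidebar"}


def split_parameter_blocks(handle):
    """Group rows into blocks, each opened by a 'parameter,<type>' row."""
    blocks = []
    for row in csv.reader(handle):
        cells = [cell.strip() for cell in row]
        if not any(cells):
            continue  # skip blank lines
        if cells[0] == "parameter":
            if len(cells) < 2 or cells[1] not in VALID_TYPES:
                raise ValueError(f"unknown parameter type in row: {row}")
            blocks.append({"type": cells[1], "rows": []})
        elif not blocks:
            raise ValueError("rows found before the first 'parameter' line")
        else:
            blocks[-1]["rows"].append(cells)
    return blocks


# Hypothetical two-block parameter file in csv form.
text = (
    "parameter,site\n"
    "sitename,BRAINSPAN\n"
    "theme,light\n"
    "\n"
    "parameter,dataset\n"
    "datasetkey,brainspandata\n"
)
blocks = split_parameter_blocks(io.StringIO(text))
block_types = [block["type"] for block in blocks]
print(block_types)
```

Running a check like this on your own file catches a misspelled parameter type or a stray row above the first parameter line before you move on to the next step.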
The Site Parameter Type¶
Description¶
With the site parameter, you will specify the parameters for the entire site.
Variables¶
sitename, name
- The main title name for your website.
topname, name
- A subheader name for your website. This is often your name or your lab name.
toplink, websiteLink
- The website link that opens when topname is clicked. This is often your personal web page or your lab web page.
defaultterm, defaultTermName
- The default search term that is initially used when visiting the site.
theme, themeName
- The style of your website. Options are “light” or “dark”.
defaultpanel, defaultPanelName
- The default panel that is initially displayed.
Example Site Parameter¶
parameter,site
sitename,BRAINSPAN
topname,DATASPECTRA
toplink,http://www.dataspectra.org
defaultterm,ARX
theme,light
defaultpanel,agepanel
The Dataset Parameter Type¶
Description¶
With the dataset parameter, you will specify how the data will be accessed and stored by the server. You can have multiple dataset parameter files.
Variables¶
datasetkey, name
- A unique name you will use to refer to this dataset in other parameters.
datasetfile, filePath
- The actual name of this data file in the input directory.
searchrowstart, number
- The row number (one-indexed) at which to start the search.
searchcol, number
- The column number (one-indexed) where the primary search term is located.
Example¶
parameter,dataset
datasetkey,brainspandata
datasetfile,ds.tutorial.data.csv
searchcol,1
searchrowstart,2
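Because both searchcol and searchrowstart are one-indexed, it can help to see how they map onto a table. The sketch below is a hypothetical illustration in Python (the function build_search_index is not part of dataspectra) and uses made-up rows.

```python
def build_search_index(rows, searchcol, searchrowstart):
    """Map each primary search term to its one-indexed row number."""
    index = {}
    for rownum, row in enumerate(rows, start=1):
        if rownum < searchrowstart:
            continue  # rows above searchrowstart (e.g. a header) are not searched
        term = row[searchcol - 1]  # convert the one-indexed column to a list index
        index[term] = rownum
    return index


# Hypothetical table: one header row, then one gene per row.
rows = [
    ["gene", "sample1", "sample2"],
    ["ARX", "1.5", "2.0"],
    ["GFAP", "10.2", "11.9"],
]
index = build_search_index(rows, searchcol=1, searchrowstart=2)
print(index)  # {'ARX': 2, 'GFAP': 3}
```

With searchrowstart set to 2, the header row is skipped, so the word "gene" never becomes a search term.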
The Side Bar Parameter Type¶
Description¶
With the sidebar parameter, you will specify how the side bar is displayed. Because the order of the sidebar elements matters, you will start off by using the “START” term. This indicates that the elements that follow are ordered.
Ordered Variables¶
START
- Specifies that the subsequent terms should be ordered in the corresponding manner.
SEARCH, placeholder
- Creates the search box. placeholder is used as the placeholder text in the search box.
SPACE
- Creates a space in the side bar.
BUTTON, buttonText, panelkey
- Creates a button that links to a specific panel. The text inside the button is specified by buttonText. The panel that it will link to is specified by panelkey.
Example¶
parameter,sidebar
START
SEARCH,SEARCH
SPACE
PANEL,Age,agepanel
PANEL,Distributions,distributionpanel
The Figure parameter type¶
Description¶
The figure parameter type encompasses all objects in the panel. This includes plots and titles. For each figure, you will need to create a separate figure parameter type. This section describes how each figure accesses the data and how it is plotted. Because there are a number of figure types, we will only describe the format for two of them: the title figure and the barplot figure. The title figure displays a specific column of text from your dataset; we will use it to display the search term. Check out the “Parameters” link for more detail on the other plots.
Here, we also have the START term, which is used in your figure parameter type to separate the unordered variables from the ordered rows.
The unordered variables will go first, and then the START term, and lastly the ordered rows. As the name suggests, the order of the ordered rows will be used in the panel.
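The split at START can be sketched in a few lines of Python. This is only an illustration of the layout described above, not dataspectra's own code; the function name split_at_start and the sample rows are assumptions for the example.

```python
def split_at_start(rows):
    """Separate unordered key/value rows from the ordered rows after START."""
    unordered = {}
    ordered = []
    seen_start = False
    for row in rows:
        if not seen_start and row[0] == "START":
            seen_start = True  # everything after this row keeps its order
        elif seen_start:
            ordered.append(row)
        else:
            unordered[row[0]] = row[1] if len(row) > 1 else ""
    if not seen_start:
        raise ValueError("no START row found")
    return unordered, ordered


# Hypothetical figure parameter rows, already split on commas.
rows = [
    ["figurekey", "mybarplot"],
    ["figuretype", "barplot"],
    ["START"],
    ["BAR", "Astrocytes", "2-4", "data", "(155;155;155)"],
    ["SPACE"],
    ["BAR", "Total", "11-13", "data", "(155;155;155)"],
]
unordered, ordered = split_at_start(rows)
print(unordered)
print([row[0] for row in ordered])
```

The unordered variables end up in a dictionary, where their order does not matter, while the ordered rows stay in a list so that their position in the file is preserved.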
Unordered Variables¶
figurekey
- A unique name for your figure
figuretype
- The type of figure. Options are (title, boxplot, barplot, scatterplot, mdscatter, violin, carousel)
valuelabel
- The unit label for the value you want. This will usually be on the y-axis.
title
- The name on the top of the figure. If you put “None” then no name will be placed.
datasetkey
- The datasetkey of the dataset that this figure accesses.
xtickfontsize
- This (when relevant) specifies the font size (in px) of your x-axis tick labels.
xtickangle
- This (when relevant) specifies the angle of orientation of your tick labels. (0 is horizontal, 90 is vertical)
Ordered variables (for barplot)¶
START
- Specifies that the subsequent terms should be ordered in the corresponding manner.
BAR, name, columns, datarow, color
- Adds a bar to your plot. name will be the label for this bar. columns gives the dataset columns for this bar: a range can be specified with “-”, where 2-4 refers to columns 2, 3 and 4, and individual columns can be listed with a “$” separator (e.g. 2$3$4 also accesses columns 2, 3 and 4 in the dataset). datarow is the dataset row that will be used for this element. Put “data” here, unless you are using a metadata file. color is specified with RGB values that are “;”-separated and surrounded by parentheses (e.g. (155;155;155)).
SPACE
- Adds an empty space next to the bar.
Example¶
parameter,figure
figurekey,mybarplot
figuretype,barplot
valuelabel,FPKM
title,Expression in brain
datasetkey,braindata
START
BAR,Astrocytes,2-4,data,(155;155;155)
BAR,Neurons,5-7,data,(155;155;155)
BAR,Microglia,8-10,data,(155;155;155)
SPACE
BAR,Total,11-13,data,(155;155;155)
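The columns and color fields use small notations of their own, so here is a hedged Python sketch of how they could be interpreted. The helpers parse_columns and parse_color are hypothetical and are not taken from dataspectra. Accepting a mix such as 2-4$7 and limiting each color value to 0-255 are assumptions of this sketch.

```python
def parse_columns(spec):
    """Expand '2-4' or '2$3$4' into a list of one-indexed column numbers."""
    columns = []
    for part in spec.split("$"):
        if "-" in part:
            first, last = (int(x) for x in part.split("-"))
            columns.extend(range(first, last + 1))  # ranges are inclusive
        else:
            columns.append(int(part))
    return columns


def parse_color(spec):
    """Turn '(155;155;155)' into an (r, g, b) tuple of integers."""
    text = spec.strip()
    if not (text.startswith("(") and text.endswith(")")):
        raise ValueError(f"color must be wrapped in parentheses: {spec}")
    values = tuple(int(v) for v in text[1:-1].split(";"))
    if len(values) != 3 or any(not 0 <= v <= 255 for v in values):
        raise ValueError(f"color needs three values from 0 to 255: {spec}")
    return values


print(parse_columns("2-4"))          # [2, 3, 4]
print(parse_columns("2$3$4"))        # [2, 3, 4]
print(parse_color("(155;155;155)"))  # (155, 155, 155)
```

Both column notations in the example expand to the same list, which matches the description of the BAR row above.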
Ordered variables (for title)¶
START
- Specifies that the subsequent terms should be ordered in the corresponding manner.
TEXT, None, columns, datarow, color
- Adds a text element to your title. Put “None” in the second field, as in the example for this figure type. columns is the column from the dataset that you want displayed. This is usually 1, if your search term is in the first column of your dataset. datarow is the dataset row that will be used for this element. Put “data” here, unless you are using a metadata file. color is specified with RGB values that are “;”-separated and surrounded by parentheses (e.g. (155;155;155)).
Example¶
parameter,figure
figurekey,agetitle
figuretype,title
datasetkey,brainspandata
START
TEXT,None,1,data,(0;0;0)
The Panel Parameter Type¶
Description¶
This parameter type specifies the layout of the panel. Since the panel can contain multiple figures, you can specify the width and height of the figures and how many figures appear per row. You will also specify here the information that goes in the tabs, which are included in all panels. Note that these websites have responsive designs, so the actual width will change as the user changes the size of the window. To accommodate this, we will represent width as a percentage of the panel width.
This parameter type also has a START term, to divide the unordered and the ordered variables as discussed above.
Unordered Variables¶
panelkey
- A unique name for your panel that will be used to reference the panel in other parameter types.
setname
- The text that you would like displayed as the header of the info section in the tabs.
info
- The text that you would like displayed in the info section.
citetext
- The text that you would like displayed in the citation section of the tabs.
citelink
- The webpage that you would like forwarded when a user clicks the citetext.
Ordered Variables¶
START
- Specifies that the subsequent terms should be ordered in the corresponding manner.
FIGURE, figurekey, widthpct, har, rownum, colnum
- Denotes the addition of a figure. figurekey should match the figurekey in the figure parameter for the figure that you would like to display. widthpct is the percent width (1-100) of the panel that the figure should take up. har can be either the height in pixels (just a number), or it can be in the format R(ar), where “ar” refers to the aspect ratio that you would like to maintain. See the advanced section for more details. rownum is the row number for the figure in the panel. The first row should be 1. colnum is the column number for the figure in the panel. The leftmost column should be 1.
Example¶
parameter,panel
panelkey,agepanel
citetext,Allen Human Brain Atlas; Hawrylycz M.J. et al. (2012) An anatomically comprehensive atlas of the adult human transcriptome; Nature 489: 391-399. doi: 10.1038/nature11405
citelink,http://www.alleninstitute.org/
setname,Association between Age and Gene Expression
info,This data shows expression levels in brain with varying ages.
START
FIGURE,agetitle,100,200,1,1
FIGURE,agebarplot,100,400,2,1
FIGURE,ageboxplot,50,400,3,1
FIGURE,ageviolin,50,400,3,2
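A quick way to sanity-check a panel layout is to add up the widthpct values in each row and to confirm that every har value is either a pixel height or an R(ar) aspect ratio. The Python sketch below does this for the example rows above. The helpers parse_har and row_widths are hypothetical and not part of dataspectra, and the sketch does not assume how dataspectra itself handles a row whose widths exceed 100.

```python
import re
from collections import defaultdict


def parse_har(har):
    """Return ('pixels', n) for a plain number or ('ratio', ar) for R(ar)."""
    match = re.fullmatch(r"R\(([0-9.]+)\)", har)
    if match:
        return ("ratio", float(match.group(1)))
    return ("pixels", int(har))


def row_widths(figure_rows):
    """Sum the widthpct of the figures placed in each panel row."""
    widths = defaultdict(int)
    for _keyword, _figurekey, widthpct, _har, rownum, _colnum in figure_rows:
        widths[int(rownum)] += int(widthpct)
    return dict(widths)


# The FIGURE rows from the example panel, already split on commas.
figure_rows = [
    ["FIGURE", "agetitle", "100", "200", "1", "1"],
    ["FIGURE", "agebarplot", "100", "400", "2", "1"],
    ["FIGURE", "ageboxplot", "50", "400", "3", "1"],
    ["FIGURE", "ageviolin", "50", "400", "3", "2"],
]
widths = row_widths(figure_rows)
print(widths)               # {1: 100, 2: 100, 3: 100}
print(parse_har("400"))     # ('pixels', 400)
print(parse_har("R(1.5)"))  # ('ratio', 1.5)
```

In the example panel every row adds up to 100, with the third row split evenly between the boxplot and the violin plot.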
Troubleshooting¶
- All of the files should be in one folder.
- Files can be either .csv, .txt, or .xlsx.
- .txt files should be Tab-delimited.
- If you are using Excel, only the first sheet of each .xlsx file will be used.