Creating your input files

Overview

You will first need to supply the data that your website will use. In addition, you will need to create a parameter file (in Excel or a text editor) that specifies how the webpage will look. Here, we will talk about what format your data files should be and how to create your parameter file.

This step can take the longest (> 10 minutes, depending on your website design).

Tutorial/Quickstart

For getting a tutorial website running quickly:

  • You can download an example dataset and parameter file .
  • After double-clicking to extract the file, you will find a tutorial_files folder.
  • The tutorial_files contains a data file (ds.tutorial.data.csv) and a complete parameter file (ds.tutorial.parameters.xlsx).
  • To simply test dataspectra, you can use these and skip to Step 2.
  • For the uninitiated, I would recommend opening these files and taking a look to see what they look like.

For building your own website:

  • I would recommend starting off with getting a tutorial up and running.
  • Then, take a look at the template file (ds.tutorial.parameters.template.xlsx) included with the tutorial material.
  • You can fill in the parameters and delete unnecessary parts as needed.

Tutorial File Contents:

tutorial_files
The input directory where the example files are located.
ds.tutorial.data.csv
The example tutorial dataset file. The primary search term is in the first column. The rest of columns are different samples. The rows represent different genes.
ds.tutorial.metadata.csv
A file that holds the metadata for the data file.
ds.tutorial.parameters.xlsx
The example tutorial parameter file. Information about this file is described below.
ds.parameter.template.xlsx
The file has the template for a parameter file, with no information filled in.

Organization of Input Files

All the input, including the data file and the parameter file should be in one folder. We will refer to this as the “input directory”.

Input directory
The folder or directory where all of your input files are kept.

Data Files Overview

The data files are the large tables that hold the data that you would like to plot. Your primary search terms should be along one column (rather than a row). This means that your data should usually be taller than wide. The datafiles can be .csv (comma-separated) or .txt (tab-delimited) format.

Primary search terms
The terms that your user will search for in your website.

Parameter File Overview

The parameter file specifies how your webpage looks and how your data file is formatted.

_images/layout_parameters.png

There are 5 types of parameters to specify:

  • Figure
  • Side Bar
  • Site
  • Panel
  • Dataset

Each type, except for the “Site” and “Side Bar”, can be used more than once. For example, you may have multiple datasets or multiple figures. Take a look at the tutorial file - ds.tutorial.parameters.xlsx - to see what it looks like.

A template parameter file is also included (ds.parameter.template.xlsx) to guide you in creating your own parameter file.

A few notes:

  • The columns should be separated by Excel boxes, commas, or tabs depending on the type of file you’re saving it as.

  • Make sure to save the file with the appropriate suffix (.csv for commas-separated columns, .txt for tab-separated columns, and .xlsx for Excel files)

  • If you are going to save a comma-separated file, make sure to not include any commas within a particular text column.

  • Unless otherwise specified, the variables for each parameter file should be in the file format as (for e.g. in a csv file)

    variablename,The name of your variable
                 variablekey,The variable key that you want to use
                 
  • When specifying a new parameter type (e.g. a new figure, dataset, or panel) , you will need to specify the type in the first line of the parameter.

    parameter, parameterType

    Specify the type of parameter file. Options are { dataset, figure, panel, site, sidebar }

    parameter,dataset
                 

The Site Parameter Type

Description

With the site parameter, you will specify the parameters for the entire site.

Variables

sitename, name
The main title name for you website
topname, name
A subheader name for your website. This is often your name or your lab name.
toplink, websiteLink
A website link for that redirects when topname is clicked. This is often your personal web page or you lab web page.
defaultterm, defaultTermName
The default search term that is initially used when visiting the site.
theme, themeName
The style of your website. Options are “light” or “dark”
defaultpanel, *defaultPanelName
The default panel that is initially displayed.

Example Site Parameter

parameter,site
             sitename,BRAINSPAN
             topname,DATASPECTRA
             toplink,http://www.dataspectra.org
             defaultterm,ARX
             theme,light
             defaultpanel,agepanel
             

The Dataset Parameter Type

Description

With the dataset parameter, you will specify how the data will be accessed and stored by the server. You can have multiple dataset parameter files.

Variables

datasetkey, name
A unique name you will use to refer to this dataset in other parameters.
datasetfile, filePath
The actual name of this data file in the input directory.
searchrowstart, number
The row number (one-indexed) to start the search.
searchcol, number
The column number (one-indexed) where the primary search term is located.

Example

parameter,dataset
             datasetkey,brainspandata
             datasetfile,ds.tutorial.data.xlsx
             searchcol,1
             searchrowstart,2
             

The Side Bar Parameter Type

Description

With the sidebar parameter, you will specify how the side bar is displayed. Because the order of the sidebar elements matter, you will start off by using the “START” term. This indicates that the following elements are ordered.

Ordered Variables

START
Specifies that the subsequent terms should be ordered in the corresponding manner.
SEARCH, placeholder
Creates the search box. placeholder is placed as the placeholder text in the search box.
SPACE
Creates a space in the side bar.
BUTTON, buttonText, panelkey
Creates a button that links to a specific panel. The text inside the button is specified by buttonText. The panel that it will link to is specified by panelkey.

Example

parameter,sidebar
             START
             SEARCH,SEARCH
             SPACE
             PANEL,Age,agepanel
             PANEL,Distributions,distributionpanel
             

The Figure parameter type

Description

The figure parameter type encompasses all objects in the panel. This includes plots and titles. For each figure, you will need to create a separate figure parameter type. This section instructs how each figure accesses the data and how it is plotted. Because there a number of types of figures, we will only describe the format for two types - the title figure, and the barplot figure. The title figure will display a specific column of text from your dataset. We will use it to display the search term. Check out “Parameters” link for more detail on the other plots.

Here, we also have the START term, which will be used in your figure parameter type to distinguish the variables from the ordered rows.

The unordered variables will go first, and then the START term, and lastly the ordered rows. As the name suggests, the order of the ordered rows will be used in the panel.

Unordered Variables

figurekey
A unique name for your figure
figuretype
The type of figure. Options are (boxplot, barplot, scatterplot, mdscatter, violin, carousel)
valuelabel
The unit label for the value you want. This will usually be on the y-axis.
title
The name on the top of the figure. If you put “None” then no name will be placed.
datasetkey
The datasetkey that this figure to accesses.
xtickfontsize
This (when relevent) specifies the font size (in px) of your x-axis.
xtickangle
This (when relevent) specifies the angle of orientation of your tick labels. (0 is horizontal, 90 is vertical)

Ordered variables (for barplot)

START
Specifies that the subsequent terms should be ordered in the corresponding manner.
BAR, name , columns , datarow, color
Adds a bar to your plot. name will be the label for this bar. columns can be specified by “-“, where 2-4 will refer to the columns 2,3 and 4. Individual columns can also be specified with a “$” separation (e.g. 2$3$4 accesses columns 2-4 in the dataset). color is specified with rgb values that are “;”-separated and surrounded by parantheses. (e.g. (155;155;155)). datarow is the dataset row that will be used for this element. Put “data” here, unless you are using a metadata file.
SPACE
Adds an empty space next to the bar.

Example

parameter,figure
             figurekey,mybarplot
             figuretype,barplot
             valuelabel,FPKM
             title,Expression in brain
             datasetkey,braindata
             START
             BAR,Astrocytes,2-4,data,(155;155;155)
             BAR,Neurons,5-7,data,(155;155;155)
             BAR,Microglia,8-10,data,(155;155;155)
             SPACE
             BAR,Total,11-13,data,(155;155;155)
             

Ordered variables (for title)

START
Specifies that the subsequent terms should be ordered in the corresponding manner.
TEXT, None , columns , datarow, color
Adds a term for your. name will be the label for this bar. columns - put here the column from the dataset that you want displayed. Usually 1, if you search term is the first column of your dataset. datarow is the dataset row that will be used for this element. Put “data” here, unless you are using a metadata file. color is specified with rgb values that are “;”-separated and surrounded by parantheses. (e.g. 155;155;155).

Example

parameter, figure
             figurekey,agetitle
             figuretype,title
             datasetkey,brainspandata
             START
             TEXT,None,1,data,(0;0;0)
             

The Panel Parameter Type

Description

This parameter type specifies the layout of the panel. Since the panel can contain multiple figures, you can specify the width and height of the figures and how many figures per row. You will also specify here the information that goes in the tabs, which is included in all panels. Note that these websites have responsive designs, so the actual width will change as the user changes the size of the window. To accomodate this, we will represent width as the percent width of the window.

This parameter type also has a START term, to divide the unordered and the orderd variables as discussed above.

Unordered Variables

panelkey
A unique name for your panel that will be used to reference the panel in other parameter types.
setname
The text that you would like displayed as the header of the info section in the tabs.
info
The text that you would like displayed in the info section.
citetext
The text that you would like displayed in the citation section of the tabs.
citelink
The webpage that you would like forwarded when a user clicks the citetext.

Ordered Variables

START
Specifies that the subsequent terms should be ordered in the corresponding manner.
FIGURE, figurekey, widthpct, har, rownum, colnum
Denotes the addition of a figure. figurekey should match the figurekey in the figure parameter file for the figure that you would like to display. widthpct is the percent width (1-100) of the panel that the figure should take up. har can be either the height in pixels (just a number), or it can be in the format R(ar) where “ar” refers to the aspect ratio that you would like to maintain. See the advanced section for more details. rownum is the row number for the figure in the panel. The first row should be 1. colnum is the col number for the figure in the panel. Leftmost colnum should be 1.

Example

parameter,panel
             panelkey,agepanel
             citetext,Allen Human Brain Atlas; Hawrylycz M.J. et al. (2012) An anatomically comprehensive atlas of the adult human transcriptome; Nature 489: 391-399. doi: 10.1038/nature11405
             citelink,http://www.alleninstitute.org/
             setname,Association between Age and Gene Expression
             info,This data shows expression levels in brain with varying ages.
             START
             FIGURE,agetitle,100,200,1,1
             FIGURE,agebarplot,100,400,2,1
             FIGURE,ageboxplot,50,400,3,1
             FIGURE,ageviolin,50,400,3,2
             

Troubleshooting

  • All of the files should be in one folder.
  • Files can be either .csv, .txt, or .xlsx.
  • .txt files should be Tab-delimited.
  • If you are using Excel, only the first sheet of each .xlsx file will be used.