Jekyll2023-12-29T11:06:06+00:00https://jwist.github.io/hastaLaVista/feed.xmlinteractive data visualization with Rinteractive data visualization in the browser with R and visualizer javascript projectStrategy for improving the characterisation of human metabolic phenotypes using COmbined Multiblock Principal components Analysis with Statistical Spectroscopy (COMPASS)2020-07-29T04:31:34+00:002020-07-29T04:31:34+00:00https://jwist.github.io/hastaLaVista/r,/metabolic/profiling,/phenotyping/2020/07/29/characterisation-of-human-phenotypes-using-R<p>We have recently published a strategy for improving human metabolic phenotyping using Combined Multiblock Principal components Analysis with Statistical Spectroscopy (COMPASS). The COMPASS approach is developed within the R environment. The open access manuscript can be found <a href="https://doi.org/10.1093/bioinformatics/btaa649">here</a>.</p>
<p>In this blog, we describe how to get started.</p>
<p>Characterising and understanding how human phenotypes relate to population statistics requires the ability to ascertain the occurrence of certain traits in individuals from different populations. For NMR and MS spectroscopic datasets, this means being able to estimate the presence of a specific feature, i.e. a signal, across the whole dataset (population) and to aggregate the results by sub-population. In doing so, we can estimate how the occurrence of that feature varies between populations. For example, if we are interested in type-2 diabetes in a population, we could observe how the anomeric doublet at 5.23 ppm varies across sub-populations.
As NMR measurement is quantitative, the ideal way to estimate the presence of a specific metabolite is to quantify its concentration. However, this is not always as trivial as it sounds, partly because of peak overlap. To overcome this, COMPASS estimates the presence or absence of a signal, i.e. a trait, by computing the cross-correlation of a feature against a “reference”. The “reference feature” (pattern/signal) is a feature of interest that can be selected from the multivariate data analysis modelling pipeline. We will show you how to do this.</p>
<p>Running COMPASS requires two R packages: <a href="https://github.com/kimsche/MetaboMate"><code class="language-plaintext highlighter-rouge">MetaboMate</code></a> for multivariate analysis and <a href="https://github.com/jwist/hastaLaVista"><code class="language-plaintext highlighter-rouge">hastaLaVista</code></a> <img src="/hastaLaVista/assets/hlvLogo50px.png" alt="drawing" width="50px" /> for interactive visualization. For installation instructions, please refer to the README file provided with each package.</p>
<h2 id="data-modeling">Data modeling</h2>
<p>In the github.com/cheminfo/COMPASS repository, a demo <a href="https://doi.org/10.3389/fmicb.2011.00183">dataset</a> and the file <a href="https://github.com/cheminfo/COMPASS/blob/master/multiblocking.R"><em>multiblocking.R</em></a> are provided to illustrate the functionality of COMPASS.
This script runs multiple principal component analysis (PCA) models, one for each block of 0.5 ppm. If desired, the user can modify the block size on line 24 of this demo file.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="c1"># define the blocks (here blocks of 0.5 ppm are preferred)</span><span class="w">
</span><span class="n">rangeList</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="nf">c</span><span class="p">(</span><span class="m">0.5</span><span class="p">,</span><span class="w"> </span><span class="m">1.000</span><span class="p">),</span><span class="w">
</span><span class="nf">c</span><span class="p">(</span><span class="m">1.0005</span><span class="p">,</span><span class="w"> </span><span class="m">1.5</span><span class="p">),</span><span class="w">
</span><span class="nf">c</span><span class="p">(</span><span class="m">1.5005</span><span class="p">,</span><span class="w"> </span><span class="m">2</span><span class="p">),</span><span class="w">
</span><span class="nf">c</span><span class="p">(</span><span class="m">2.0005</span><span class="p">,</span><span class="w"> </span><span class="m">2.5</span><span class="p">),</span><span class="w">
</span><span class="nf">c</span><span class="p">(</span><span class="m">2.5005</span><span class="p">,</span><span class="w"> </span><span class="m">3</span><span class="p">),</span><span class="w">
</span><span class="nf">c</span><span class="p">(</span><span class="m">3.0005</span><span class="p">,</span><span class="w"> </span><span class="m">3.5</span><span class="p">),</span><span class="w">
</span><span class="nf">c</span><span class="p">(</span><span class="m">3.5005</span><span class="p">,</span><span class="w"> </span><span class="m">4.0</span><span class="p">),</span><span class="w">
</span><span class="nf">c</span><span class="p">(</span><span class="m">6.5005</span><span class="p">,</span><span class="w"> </span><span class="m">7</span><span class="p">),</span><span class="w">
</span><span class="nf">c</span><span class="p">(</span><span class="m">7.0005</span><span class="p">,</span><span class="w"> </span><span class="m">7.5</span><span class="p">),</span><span class="w">
</span><span class="nf">c</span><span class="p">(</span><span class="m">7.5005</span><span class="p">,</span><span class="w"> </span><span class="m">8</span><span class="p">),</span><span class="w">
</span><span class="nf">c</span><span class="p">(</span><span class="m">8.0005</span><span class="p">,</span><span class="w"> </span><span class="m">8.5</span><span class="p">),</span><span class="w">
</span><span class="nf">c</span><span class="p">(</span><span class="m">8.5005</span><span class="p">,</span><span class="w"> </span><span class="m">9</span><span class="p">),</span><span class="w">
</span><span class="nf">c</span><span class="p">(</span><span class="m">9.0005</span><span class="p">,</span><span class="w"> </span><span class="m">9.49</span><span class="p">))</span></code></pre></figure>
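<p>For other block sizes, a list of this shape could also be generated rather than typed by hand. The following is only a minimal sketch: <code class="language-plaintext highlighter-rouge">make_blocks</code> is a hypothetical helper, not part of the demo script, and the 0.0005 ppm offset between neighbouring blocks is an assumption copied from the values above. The generated bounds are close to, but not identical to, the hand-written list (for example the last block ends at 9.5 instead of 9.49).</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"># Hypothetical helper: build contiguous blocks of a given width (in ppm).
# `step` is an assumed offset that keeps neighbouring blocks from sharing a point.
make_blocks <- function(from, to, width = 0.5, step = 0.0005) {
  lower <- seq(from, to - width, by = width)
  lapply(seq_along(lower), function(i) {
    lo <- if (i == 1) lower[i] else lower[i] + step
    c(lo, lower[i] + width)
  })
}

# Two regions, leaving out 4.0 to 6.5 ppm as in the demo list
rangeList <- c(make_blocks(0.5, 4.0), make_blocks(6.5, 9.5))
</code></pre></figure>
<p>Changing <code class="language-plaintext highlighter-rouge">width</code> is then enough to try a different block size.</p>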
<p>For very large datasets (typically a file size above 0.5 GB), loading the whole dataset into the browser for interactive visualisation may become impractical. In that case, <a href="https://github.com/jwist/hastaLaVista"><code class="language-plaintext highlighter-rouge">hastaLaVista</code></a> <img src="/hastaLaVista/assets/hlvLogo50px.png" alt="drawing" width="50px" /> can be configured to retrieve spectra only when necessary. To do this, uncomment lines 45-53 and define a path to store the JSON. This converts the spectral data into individual JSON files, enabling fast and efficient interaction in the browser. To do so, you must first locate the folder where <a href="https://github.com/jwist/hastaLaVista"><code class="language-plaintext highlighter-rouge">hastaLaVista</code></a> <img src="/hastaLaVista/assets/hlvLogo50px.png" alt="drawing" width="50px" /> is stored in your installation. This is achieved using the <code class="language-plaintext highlighter-rouge">.libPaths()</code> command; then create the folder <code class="language-plaintext highlighter-rouge">visu/data/json</code> there. Line 46 of the demo file must point to this json folder.
Running the script will point your default browser to the following URL:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>http://127.0.0.1:5474/?viewURL=http://127.0.0.1:5474/view/modelExplorer_1_1.view.json&dataURL=http://127.0.0.1:5474/data/multiblocking.data.json
</code></pre></div></div>
<p>You should then see the following page, where you can explore the scores and loadings for each of the block PCA models, the results for the whole PCA, and the super scores and super loadings that are reconstructed by combining the scores and loadings of all the blocks. Using Alt-click, it is possible to select scores and then select the spectra that the user wants to display.</p>
<p>This visualisation tool allows the user to explore the dataset with a high level of granularity to define the features of interest.</p>
<p><img src="/hastaLaVista/assets/model-explorer.gif" alt="drawing" width="800px" /></p>
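<p>The JSON folder mentioned above can be created from R instead of by hand. This is a minimal sketch under stated assumptions: <code class="language-plaintext highlighter-rouge">make_json_dir</code> is a hypothetical helper, and it assumes that <code class="language-plaintext highlighter-rouge">visu/data/json</code> sits directly under the installed package folder, which should be checked against your own installation.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"># Hypothetical helper: create visu/data/json under a given package folder
# and return the resulting path.
make_json_dir <- function(pkg_root) {
  json_dir <- file.path(pkg_root, "visu", "data", "json")
  dir.create(json_dir, recursive = TRUE, showWarnings = FALSE)
  json_dir
}

# With hastaLaVista installed, base R's find.package() returns its folder
# inside one of the libraries listed by .libPaths():
# json_dir <- make_json_dir(find.package("hastaLaVista"))
</code></pre></figure>
<p>The returned path is the value that line 46 of the demo file would need to point to.</p>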
<h2 id="feature-identification-and-selection">Feature identification and selection</h2>
<p>A useful feature of the vista is that it allows the user to select spectra from the scores plot and then use these data to compute a Statistical TOtal Correlation SpectroscopY (<a href="http://dx.doi.org/10.1021/ac048630x">STOCSY</a>) model, which finds all the correlated peaks relating to the feature of interest. The driver peak used to compute STOCSY is set simply by Alt-clicking on the corresponding variable in the selected spectra (see video below). A STOCSY trace will appear in the middle gray area, allowing the user to identify features that belong to the same molecule.</p>
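<p>To give an idea of what such a trace contains, here is a minimal sketch on synthetic data. It is not the code used by the vista: the matrix <code class="language-plaintext highlighter-rouge">X</code>, the driver index and the peak positions are all made up for illustration. The trace is simply the correlation (and covariance) of the driver variable with every other variable across the selected spectra.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r">set.seed(1)
n <- 30                                   # synthetic spectra (rows)
conc <- runif(n, 1, 5)                    # concentration of one hypothetical molecule
X <- matrix(rnorm(n * 6, sd = 0.1), nrow = n)
X[, 2] <- X[, 2] + conc                   # two peaks that belong to the same molecule
X[, 5] <- X[, 5] + 0.5 * conc

driver <- 2                               # index of the driver peak
r <- as.numeric(cor(X[, driver], X))      # correlation trace, one value per variable
covar <- as.numeric(cov(X[, driver], X))  # covariance trace, same length
</code></pre></figure>
<p>Variables with a high value in <code class="language-plaintext highlighter-rouge">r</code> (here the fifth one) are candidates for signals of the same molecule as the driver.</p>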
<p><img src="/hastaLaVista/assets/stocsy.gif" alt="drawing" width="800px" /></p>
<h2 id="computing-cross-correlation">Computing cross-correlation</h2>
<p>In the github.com/cheminfo/COMPASS repository, a demo <a href="https://github.com/cheminfo/COMPASS/blob/master/MBCC-metaboliteX.Rmd"><em>MBCC-metaboliteX.Rmd</em></a> file is provided to work with the provided demo <a href="https://doi.org/10.3389/fmicb.2011.00183">dataset</a>.</p>
<p>The following code chunk from the demo file (lines 49 – 57) allows the user to select the features that will be used as a reference. In this example we selected two features: a triplet between 2.42 ppm and 2.47 ppm, and a doublet between 1.47 ppm and 1.50 ppm. Both features are well resolved in the first sample (ID = 1), so that sample provides the reference features used to compute the cross-correlation.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">patternID</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">1</span><span class="w">
</span><span class="n">identifier</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ID</span><span class="w">
</span><span class="n">L</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">length</span><span class="p">(</span><span class="n">ID</span><span class="p">)</span><span class="w">
</span><span class="c1"># several ranges may be defined and combined to create a pattern</span><span class="w">
</span><span class="c1"># define here the range that should be used</span><span class="w">
</span><span class="nb">F</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">which</span><span class="p">(</span><span class="n">x_axis</span><span class="w"> </span><span class="o">></span><span class="w"> </span><span class="m">2.42</span><span class="w"> </span><span class="o">&</span><span class="w"> </span><span class="n">x_axis</span><span class="w"> </span><span class="o"><</span><span class="w"> </span><span class="m">2.47</span><span class="p">)</span><span class="w">
</span><span class="n">F2</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">which</span><span class="p">(</span><span class="n">x_axis</span><span class="w"> </span><span class="o">></span><span class="w"> </span><span class="m">1.47</span><span class="w"> </span><span class="o">&</span><span class="w"> </span><span class="n">x_axis</span><span class="w"> </span><span class="o"><</span><span class="w"> </span><span class="m">1.5</span><span class="p">)</span><span class="w">
</span><span class="n">rangeList</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="nb">F</span><span class="p">,</span><span class="w"> </span><span class="n">F2</span><span class="p">)</span></code></pre></figure>
<p>The reference features are shown here:</p>
<p><img src="/hastaLaVista/assets/patternX_4117.png" alt="drawing" width="800px" />
<img src="/hastaLaVista/assets/patternX_4910.png" alt="drawing" width="800px" /></p>
<p>Finally, we can compute and display the cross-correlation using the following code chunk (lines 89 – 125 in the demo file).</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="c1"># define here what information should be used as a category or a group</span><span class="w">
</span><span class="n">colorCode</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">metadata</span><span class="o">$</span><span class="n">Class</span><span class="w">
</span><span class="n">ccList</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">()</span><span class="w">
</span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="nf">seq_along</span><span class="p">(</span><span class="n">rangeList</span><span class="p">))</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nb">F</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">rangeList</span><span class="p">[[</span><span class="n">i</span><span class="p">]]</span><span class="w">
</span><span class="n">subX</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">X</span><span class="p">[,</span><span class="w"> </span><span class="nb">F</span><span class="p">]</span><span class="w">
</span><span class="n">pattern</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">as.numeric</span><span class="p">(</span><span class="n">subX</span><span class="p">[</span><span class="n">which</span><span class="p">(</span><span class="n">identifier</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="n">patternID</span><span class="p">),])</span><span class="w">
</span><span class="n">pattern</span><span class="p">[</span><span class="n">pattern</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="m">0</span><span class="p">]</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">1e-04</span><span class="w">
</span><span class="n">dil.F</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">1</span><span class="o">/</span><span class="n">apply</span><span class="p">(</span><span class="n">subX</span><span class="p">,</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="n">median</span><span class="p">(</span><span class="n">x</span><span class="o">/</span><span class="n">pattern</span><span class="p">))</span><span class="w">
</span><span class="n">subX.scaled</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">subX</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">dil.F</span><span class="w">
</span><span class="n">res</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">apply</span><span class="p">(</span><span class="n">subX.scaled</span><span class="p">,</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="n">ccf</span><span class="p">(</span><span class="n">pattern</span><span class="p">,</span><span class="w"> </span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">lag.max</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">10</span><span class="p">,</span><span class="w"> </span><span class="n">plot</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">FALSE</span><span class="p">,</span><span class="w"> </span><span class="n">type</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="s2">"correlation"</span><span class="p">))</span><span class="w">
</span><span class="p">})</span><span class="w">
</span><span class="n">r</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">unlist</span><span class="p">(</span><span class="n">lapply</span><span class="p">(</span><span class="n">res</span><span class="p">,</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="nf">max</span><span class="p">(</span><span class="n">x</span><span class="o">$</span><span class="n">acf</span><span class="p">)))</span><span class="w">
</span><span class="n">par</span><span class="p">(</span><span class="n">mar</span><span class="o">=</span><span class="nf">c</span><span class="p">(</span><span class="m">4</span><span class="p">,</span><span class="m">4</span><span class="p">,</span><span class="m">2</span><span class="p">,</span><span class="m">4</span><span class="p">))</span><span class="w">
</span><span class="n">plot</span><span class="p">(</span><span class="n">r</span><span class="p">,</span><span class="w">
</span><span class="n">main</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">paste0</span><span class="p">(</span><span class="n">metabolite</span><span class="p">,</span><span class="w"> </span><span class="s2">": cc colored by country (pattern: "</span><span class="p">,</span><span class="w"> </span><span class="n">patternID</span><span class="p">,</span><span class="w"> </span><span class="s2">")"</span><span class="p">),</span><span class="w">
</span><span class="n">xlab</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"sample index"</span><span class="p">,</span><span class="w">
</span><span class="n">ylab</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"cross-correlation"</span><span class="p">,</span><span class="w">
</span><span class="n">cex.main</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w">
</span><span class="n">cex.lab</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w">
</span><span class="n">cex.axis</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w">
</span><span class="n">cex</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w">
</span><span class="n">col</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">colorCode</span><span class="p">)</span><span class="w">
</span><span class="n">ccList</span><span class="p">[[</span><span class="n">i</span><span class="p">]]</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">unname</span><span class="p">(</span><span class="n">r</span><span class="p">)</span><span class="w">
</span><span class="p">}</span></code></pre></figure>
<p>The results of the cross-correlation for the triplet at 2.445 ppm and the doublet at 1.484 ppm are plotted in the figures below and color coded according to class:</p>
<p><img src="/hastaLaVista/assets/ccX_1.png" alt="drawing" width="800px" />
<img src="/hastaLaVista/assets/ccX_2.png" alt="drawing" width="800px" /></p>
<h2 id="definition-of-the-cross-correlation-limits-to-select-features">Definition of the cross-correlation limits to select features</h2>
<p>These figures enable the user to define the level of cross-correlation. Although cross-correlation is insensitive to chemical shifts, it is sensitive to distortions of the pattern due to overlap. It is therefore necessary to visually inspect the results. Since this inspection is mandatory and cumbersome, we have made efforts to provide an interface that makes it quick to check the results visually.</p>
<p>The two thresholds are defined on lines 137 and 139 of the script:</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="c1"># define here the threshold for selection</span><span class="w">
</span><span class="n">selectThreshold</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0.8</span><span class="w">
</span><span class="c1"># define a threshold for coloring. The samples with cross-correlation below this level (lower confidence) will be displayed with red color later.</span><span class="w">
</span><span class="n">colorThreshold</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0.9</span></code></pre></figure>
<p>In the upper figure (triplet pattern), a cross-correlation threshold can easily be defined at > 0.8. In the second case (doublet), the threshold is more ambiguous. We recommend combining several patterns (as many as possible) and computing the combined cross-correlation, since the more complete the pattern, the better defined the distribution of cross-correlation values will be. Noise is another factor that affects the cross-correlation level: high-intensity signals usually give higher cross-correlation values than signals close to the noise.</p>
<h2 id="visual-inspection-of-the-results">Visual inspection of the results</h2>
<p><strong>The visual inspection is a mandatory step in the proposed method that is not intended to be fully automated!</strong></p>
<p>To visualise the results, the user has two options: printing the results using the script (lines 145 to 181), or using <a href="https://github.com/jwist/hastaLaVista"><code class="language-plaintext highlighter-rouge">hastaLaVista</code></a> <img src="/hastaLaVista/assets/hlvLogo50px.png" alt="drawing" width="50px" /> (lines 213 to 240). In the first approach, the cross-correlation (CC) thresholds are defined in the demo script (lines 137 and 139). The feature region is then highlighted with green dots for spectra with a CC above the highest threshold, orange where the CC is between the two limits, and red where it is below the lowest. This <strong>traffic light</strong> system allows the user to quickly check the results.</p>
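<p>The traffic light rule can be written in a few lines of R. This is a minimal sketch rather than the code of the demo script: the values in <code class="language-plaintext highlighter-rouge">r</code> are made up, and treating a value exactly equal to a threshold as belonging to the upper class is an assumption.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r">selectThreshold <- 0.8
colorThreshold <- 0.9
r <- c(0.95, 0.85, 0.40, 0.90)   # hypothetical cross-correlation values

# green: at or above the highest threshold; orange: between the two; red: below
traffic <- ifelse(r >= colorThreshold, "green",
                  ifelse(r >= selectThreshold, "orange", "red"))
</code></pre></figure>
<p>The resulting vector can be passed as <code class="language-plaintext highlighter-rouge">col</code> to a plotting call to colour each spectrum accordingly.</p>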
<h3 id="inspection-using-static-images">Inspection using static images</h3>
<p>Examples of figures generated using the first approach are shown below for the triplet at 2.445 ppm.</p>
<p><img src="/hastaLaVista/assets/greenX_37.png" alt="drawing" width="800px" />
<img src="/hastaLaVista/assets/greenX_38.png" alt="drawing" width="800px" />
<img src="/hastaLaVista/assets/orangeX_40.png" alt="drawing" width="800px" />
<img src="/hastaLaVista/assets/orangeX_44.png" alt="drawing" width="800px" /></p>
<p>As expected, the cross-correlation is not sensitive to the shift of the triplet (second trace) but is sensitive to overlap (top trace). The last two traces show the CC for cases where no signal is present.</p>
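<p>This behaviour can be reproduced on synthetic data. The sketch below is an illustration only: the Gaussian line shapes, the positions and the interfering signal are all assumptions, and only the call to <code class="language-plaintext highlighter-rouge">ccf</code> mirrors the demo script. A shifted copy of the pattern should score close to 1, because the maximum is taken over several lags, while a copy distorted by a broad overlapping signal should score lower.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r">x <- seq(-1, 1, length.out = 201)      # arbitrary axis, 0.01 per point
peak <- function(center, width = 0.03) exp(-((x - center) / width)^2)

# reference triplet, a copy shifted by 5 points, and a copy with a broad overlap
triplet <- 0.5 * peak(-0.1) + peak(0) + 0.5 * peak(0.1)
shifted <- 0.5 * peak(-0.05) + peak(0.05) + 0.5 * peak(0.15)
overlapped <- triplet + 2 * peak(0.07, width = 0.15)

# maximum cross-correlation over lags -10..10, as in the demo script
max_cc <- function(a, b) max(ccf(a, b, lag.max = 10, plot = FALSE)$acf)
cc_shifted <- max_cc(triplet, shifted)
cc_overlapped <- max_cc(triplet, overlapped)
</code></pre></figure>
<p>Note that a shift larger than <code class="language-plaintext highlighter-rouge">lag.max</code> points would no longer be compensated.</p>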
<h3 id="interactive-inspection-of-the-results">Interactive inspection of the results</h3>
<p>A drawback of this approach is that the script has to be rerun each time a different threshold is defined. Since the threshold has to be set accurately and the results thoroughly checked afterwards in order to produce faithful results, we provide a tool that enables interactive exploration of the results.</p>
<p>The second option makes use of the <a href="https://github.com/jwist/hastaLaVista"><code class="language-plaintext highlighter-rouge">hastaLaVista</code></a> <img src="/hastaLaVista/assets/hlvLogo50px.png" alt="drawing" width="50px" /> package to make the visualization interactive, as shown here:</p>
<p><img src="/hastaLaVista/assets/cross-corr_explorer0.1.gif" alt="drawing" width="800px" /></p>
<p>This visualisation tool enables the user to easily observe the effect of the chosen CC threshold by moving a slider, and then to rapidly review the selected features over the whole dataset by simply choosing from the table.</p>
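<p>Once a threshold is chosen, the share of samples in which the feature is found can be tallied per group. This is a minimal sketch with made-up values: <code class="language-plaintext highlighter-rouge">r</code> and <code class="language-plaintext highlighter-rouge">group</code> are hypothetical, and counting a value equal to the threshold as found is an assumption.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r">threshold <- 0.8
r <- c(0.95, 0.91, 0.85, 0.60, 0.88, 0.30)        # hypothetical cross-correlations
group <- factor(c("pre", "pre", "pre", "2w", "2w", "2w"),
                levels = c("pre", "2w"))          # hypothetical time points

found <- r >= threshold
summary_table <- data.frame(
  n_found = as.vector(tapply(found, group, sum)),
  n_total = as.integer(table(group)),
  percent = round(100 * as.vector(tapply(found, group, mean))),
  row.names = levels(group)
)
</code></pre></figure>
<p>Each row of <code class="language-plaintext highlighter-rouge">summary_table</code> then gives the count and percentage for one group.</p>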
<p>For this example, with a threshold of 0.8 the triplet feature is found in all 19 pre-operation samples, but in only 71% (10 out of 14) of the samples after 2 weeks, 46% (6 out of 13) after 6 weeks and 62% (8 out of 13) after 8 weeks.</p>interactive visualization with R2019-11-28T01:06:34+00:002019-11-28T01:06:34+00:00https://jwist.github.io/hastaLaVista/r/2019/11/28/interactive-visualization-with-R<p>Interactive data visualization is a must for developing attractive tools for a broad audience. Here I will show how to use a JavaScript framework to visualize results computed with R. I think it is a good idea to keep computation separate from visualization to make pipelines more robust, hence the idea of using a webpage as a visualization platform.</p>
<p><a href="https://github.com/npellet/visualizer"><code class="language-plaintext highlighter-rouge">visualizer</code></a> is a webpage (a tool) that takes data as input and displays them according to a customizable layout (a view, or vista in Spanish). <a href="https://github.com/npellet/visualizer"><code class="language-plaintext highlighter-rouge">visualizer</code></a> lets you define modules that can display many different types of data and that can be chained to build complex pipelines. Since this package is built with pure JavaScript, code can be added to modules to allow even more complex manipulation of the results.</p>
<p><a href="https://github.com/npellet/visualizer"><code class="language-plaintext highlighter-rouge">visualizer</code></a> needs two files: a data.json file that contains the data or results to be displayed in JSON format, and a view.json file that describes how to display the data.</p>
<p>Both files can be produced by an R script and pushed to the webpage. This is what the <a href="https://github.com/jwist/hastaLaVista"><code class="language-plaintext highlighter-rouge">hastaLaVista</code></a> <img src="/hastaLaVista/assets/hlvLogo50px.png" alt="drawing" width="50px" /> R package does.</p>
<h3 id="getting-started">getting started</h3>
<p>First install the latest release of <img src="/hastaLaVista/assets/hlvLogo50px.png" alt="drawing" width="50px" /> using devtools.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">devtools</span><span class="o">::</span><span class="n">install_github</span><span class="p">(</span><span class="s2">"jwist/hastaLaVista"</span><span class="p">)</span></code></pre></figure>
<h3 id="check-your-install">check your install</h3>
<p>You can check that installation has been successful by loading demo files.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">library</span><span class="p">(</span><span class="n">hastaLaVista</span><span class="p">)</span><span class="w">
</span><span class="n">v</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">new</span><span class="p">(</span><span class="s2">"visualization"</span><span class="p">)</span><span class="w">
</span><span class="n">v</span><span class="o">@</span><span class="n">data</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="s2">"test.data.json"</span><span class="w">
</span><span class="n">v</span><span class="o">@</span><span class="n">view</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="s2">"test.view.json"</span><span class="w">
</span><span class="n">visualize</span><span class="p">(</span><span class="n">v</span><span class="p">)</span></code></pre></figure>
<p>In this case, no computation is performed, but R will push two files test.data.json and test.view.json that are available with the package. In case of success, you should see this: <img src="/hastaLaVista/assets/hlv-test.gif" alt="drawing" width="700px" /></p>
<p>Beware that it may take a while to load the first time. <strong>Be patient!</strong></p>
<p>Hovering over the data points in the plot will display information about them. This is a first example of interactive display.</p>
<h2 id="how-does-it-works">how does it work?</h2>
<p>As a simple example, we can display a sine and a cosine function. With the result we create a data.frame structure that contains the following elements: x, y, _highlight and info. It is important to respect these names, since they will be used to create the JSON object that will be read and interpreted by the vista (view.json).</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">seq</span><span class="p">(</span><span class="n">from</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">pi</span><span class="p">,</span><span class="w"> </span><span class="n">by</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.1</span><span class="p">)</span><span class="w">
</span><span class="n">chart1</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">data.frame</span><span class="p">(</span><span class="s2">"x"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x</span><span class="p">,</span><span class="w">
</span><span class="s2">"y"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">cos</span><span class="p">(</span><span class="m">3</span><span class="o">*</span><span class="n">x</span><span class="p">),</span><span class="w">
</span><span class="s2">"_highlight"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">seq_along</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w">
</span><span class="s2">"info"</span><span class="o">=</span><span class="w"> </span><span class="n">paste0</span><span class="p">(</span><span class="s2">"cosID: "</span><span class="p">,</span><span class="w"> </span><span class="m">0</span><span class="o">:</span><span class="m">31</span><span class="p">)</span><span class="w">
</span><span class="p">)</span><span class="w">
</span><span class="n">chart2</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">data.frame</span><span class="p">(</span><span class="s2">"x"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x</span><span class="p">,</span><span class="w">
</span><span class="s2">"y"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">sin</span><span class="p">(</span><span class="m">3</span><span class="o">*</span><span class="n">x</span><span class="p">),</span><span class="w">
</span><span class="s2">"_highlight"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">seq_along</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w">
</span><span class="s2">"info"</span><span class="o">=</span><span class="w"> </span><span class="n">paste0</span><span class="p">(</span><span class="s2">"sinID: "</span><span class="p">,</span><span class="w"> </span><span class="m">0</span><span class="o">:</span><span class="m">31</span><span class="p">)</span><span class="w">
</span><span class="p">)</span></code></pre></figure>
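<p>One detail worth checking before going further: by default, <code class="language-plaintext highlighter-rouge">data.frame()</code> passes column names through <code class="language-plaintext highlighter-rouge">make.names()</code>, so a name that starts with an underscore is rewritten. The sketch below only shows this base R behaviour and how to switch it off. Whether <code class="language-plaintext highlighter-rouge">appendData</code> expects the rewritten or the literal name is not stated in this post, so treat the choice as an assumption to verify against your own <em>vista</em>. It also derives the labels from <code class="language-plaintext highlighter-rouge">x</code> rather than hard-coding <code class="language-plaintext highlighter-rouge">0:31</code>.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r">x <- seq(from = 0, to = pi, by = 0.1)

# Default behaviour: column names go through make.names(),
# so a name starting with an underscore is rewritten.
default_names <- names(data.frame("x" = x, "_highlight" = seq_along(x) - 1))

# With check.names = FALSE the name is kept exactly as typed.
kept_names <- names(data.frame("x" = x, "_highlight" = seq_along(x) - 1,
                               check.names = FALSE))

# Deriving the labels from x keeps them in step if the grid changes.
info <- paste0("cosID: ", seq_along(x) - 1)

print(default_names)
print(kept_names)</code></pre></figure>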
<p>Each data.frame will be added to a list structure. This list is later converted into a single JSON object that contains all the variables needed for display. Each variable will be converted into an object in the JSON file (data.json).</p>
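<p>To get a feeling for what such a conversion looks like, here is a minimal sketch using the <em>jsonlite</em> package. This is an illustration only: the post does not say which serializer the package uses internally, and the list <code class="language-plaintext highlighter-rouge">d_example</code> is a hypothetical stand-in for the real list structure, so the exact layout of the generated file may differ.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r">library(jsonlite)  # assumption: used here for illustration only

x <- seq(from = 0, to = pi, by = 0.1)
chart1 <- data.frame(x = x, y = cos(3 * x))

# A named list stands in for the list structure; each name becomes a JSON key.
d_example <- list(multiChart = list(list(chart = chart1)))

# dataframe = "columns" writes each column as one JSON array;
# digits = NA keeps full numeric precision.
json <- toJSON(d_example, dataframe = "columns", digits = NA)

# Round-trip to confirm that the nesting survived serialization.
back <- fromJSON(json, simplifyVector = FALSE)
print(names(back))                        # top-level keys
print(names(back$multiChart[[1]]$chart))  # columns of the first chart</code></pre></figure>
<p>Each top-level name of the list ends up as one key of the JSON object, which is the sense in which every variable becomes an object in the file.</p>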
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="w"> </span><span class="n">chart</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">chart</span><span class="o">=</span><span class="n">chart1</span><span class="p">),</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">chart</span><span class="o">=</span><span class="n">chart2</span><span class="p">))</span><span class="w">
</span><span class="n">d</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">appendData</span><span class="p">(</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">d</span><span class="p">,</span><span class="w"> </span><span class="n">variableName</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"multiChart"</span><span class="p">,</span><span class="w"> </span><span class="n">variable</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">chart</span><span class="p">,</span><span class="w"> </span><span class="n">type</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"multiChart"</span><span class="p">)</span></code></pre></figure>
<p>The above command adds the list “chart”, which holds both data.frames, to the list <code class="language-plaintext highlighter-rouge">d</code> under the name “multiChart”. The <code class="language-plaintext highlighter-rouge">type = "multiChart"</code> argument ensures that the information in each data.frame is converted into a chart object that the “spectra displayer” module of the <em>visualizer</em> package can interpret.</p>
<p>Once all the results are stored in the list structure, a visualization object is created as described below:</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="w"> </span><span class="n">v</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">new</span><span class="p">(</span><span class="s1">'visualization'</span><span class="p">)</span><span class="w">
</span><span class="n">v</span><span class="o">@</span><span class="n">data</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="s2">"test.data.json"</span><span class="w">
</span><span class="n">v</span><span class="o">@</span><span class="n">view</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="s2">"test.view.json"</span><span class="w">
</span><span class="n">push</span><span class="p">(</span><span class="n">v</span><span class="p">,</span><span class="w"> </span><span class="s1">'data'</span><span class="p">,</span><span class="w"> </span><span class="n">d</span><span class="p">)</span><span class="w">
</span><span class="n">visualize</span><span class="p">(</span><span class="n">v</span><span class="p">)</span></code></pre></figure>
<p>The first line creates the object, while the second and third lines define the names of the files to be served. The file “test.data.json” will be created by the <code class="language-plaintext highlighter-rouge">push(v, 'data', d)</code> command. The <code class="language-plaintext highlighter-rouge">v@view</code> slot tells <img src="/hastaLaVista/assets/hlvLogo50px.png" alt="drawing" width="50px" /> which <em>vista</em> to use. This file must exist.</p>
<p>The last line, <code class="language-plaintext highlighter-rouge">visualize(v)</code>, starts a webserver (based on the <em>servr</em> package) and points your default browser to the correct URL.</p>
<p>The command <code class="language-plaintext highlighter-rouge">print(v)</code> prints that URL for later use, which avoids having to recompute the analysis.</p>
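<p>For readers curious about what serving a folder with <em>servr</em> looks like, here is a minimal sketch. It is not the code behind <code class="language-plaintext highlighter-rouge">visualize(v)</code>; the temporary folder <code class="language-plaintext highlighter-rouge">visu-demo</code> and the JSON content are made up for the example, and the port is simply the one that appears in this post.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r">library(servr)

# Hypothetical stand-in for the folder that the package serves.
root <- file.path(tempdir(), "visu-demo")
dir.create(file.path(root, "data"), recursive = TRUE, showWarnings = FALSE)
writeLines('{"hello": "world"}', file.path(root, "data", "test.data.json"))

if (interactive()) {
  # daemon = TRUE keeps the R console free; browser = FALSE skips opening a tab.
  httd(dir = root, port = 5474, daemon = TRUE, browser = FALSE)
  # The file should now be reachable at
  # http://127.0.0.1:5474/data/test.data.json
  daemon_stop()  # stop the servers started as daemons
}</code></pre></figure>
<p>The server call is wrapped in <code class="language-plaintext highlighter-rouge">interactive()</code> so that the script does not try to open a port when run in batch mode.</p>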
<p>A closer look at this URL shows that it is composed of three parts:</p>
<figure class="highlight"><pre><code class="language-js" data-lang="js"><span class="nx">http</span><span class="p">:</span><span class="c1">//127.0.0.1:5474/?viewURL=http://127.0.0.1:5474/view/test.view.json&dataURL=http://127.0.0.1:5474/data/test.data.json</span></code></pre></figure>
<p>The first part, <code class="language-plaintext highlighter-rouge">http://127.0.0.1:5474/</code>, must point to the <em>visualizer</em> root directory. This folder, <em>visu</em>, sits inside the package system folder, which can be found using the command <code class="language-plaintext highlighter-rouge">path.package("hastaLaVista")</code>. The webserver serves content from this folder, rootDirectory/visu/, which means that any file placed there can be served.</p>
<p>The second part, <code class="language-plaintext highlighter-rouge">?viewURL=http://127.0.0.1:5474/view/test.view.json</code>, is a parameter passed to the <em>visualizer</em> that tells it where to find the <em>vista</em>. In this case, the file is found in rootDirectory/visu/view/.</p>
<p>The third part, <code class="language-plaintext highlighter-rouge">dataURL=http://127.0.0.1:5474/data/test.data.json</code>, tells it where to find the data. In this case, the file is found in rootDirectory/visu/data/.</p>Interactive data visualization is a must to develop attractive tools for a broad audience. Here I will show how to use a JavaScript framework to visualize results computed with R. I think it is a good idea to keep computation separated from visualization to make more robust pipelines, hence the idea to use a webpage as a visualization platform.
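<p>The three-part URL described above can be assembled and taken apart with a few lines of base R. The helper <code class="language-plaintext highlighter-rouge">build_url</code> is hypothetical and not part of the package; it only mirrors the structure of the URL shown in this post, assuming the <em>view</em> and <em>data</em> subfolders sit directly under the served root.</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"># Hypothetical helper, not part of hastaLaVista.
build_url <- function(base, view, data) {
  paste0(base, "?viewURL=", base, "view/", view,
         "&dataURL=", base, "data/", data)
}

base <- "http://127.0.0.1:5474/"
url <- build_url(base, "test.view.json", "test.data.json")

# Split the query string back into its named parameters.
query <- sub("^[^?]*\\?", "", url)
pairs <- strsplit(strsplit(query, "&", fixed = TRUE)[[1]], "=", fixed = TRUE)
params <- setNames(vapply(pairs, `[`, "", 2), vapply(pairs, `[`, "", 1))

print(url)
print(params)</code></pre></figure>
<p>Keeping the base address in one variable makes it easy to change the port or host without touching the two file locations.</p>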