Visualizing data from Daru containers
DARU (Data Analysis in RUby) is a library for storage, analysis, manipulation and visualization of data. You can find information about daru in its repository.
GnuplotRB uses the name of a Daru::Vector or Daru::DataFrame as the dataset's title and its index as the x-axis tick labels (xtics). Example:
In [1]:
require 'daru'
require 'gnuplotrb'
include GnuplotRB
include GnuplotRB::Fit

df = Daru::DataFrame.new({
    Build: [312, 630, 315, 312],
    Test: [525, 1050, 701, 514],
    Deploy: [215, 441, 370, 220]
  },
  index: ['Run A', 'Run B', 'Run C', 'Run D']
)
df[:Overall] = df[:Build] + df[:Test] + df[:Deploy]
df
Out[1]:
Daru::DataFrame:33875020 rows: 4 cols: 4

| | Build | Deploy | Test | Overall |
|---|---|---|---|---|
| Run A | 312 | 215 | 525 | 1052 |
| Run B | 630 | 441 | 1050 | 2121 |
| Run C | 315 | 370 | 701 | 1386 |
| Run D | 312 | 220 | 514 | 1046 |
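The Overall column in the table above is the element-wise sum of the three stage columns. As a quick plain-Ruby check of that arithmetic, independent of daru (the variable names here are illustrative, not part of either library):

```ruby
# Stage timings copied from the DataFrame above, one entry per run (A..D).
build  = [312, 630, 315, 312]
test   = [525, 1050, 701, 514]
deploy = [215, 441, 370, 220]

# Element-wise sum: group the three values of each run, then add them up.
overall = build.zip(test, deploy).map(&:sum)

p overall # => [1052, 2121, 1386, 1046]
```

Adding Daru::Vector objects with `+` does the same thing, matching values up by index.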
When you pass a DataFrame to Plot.new, every column becomes a separate dataset, with the column name as the dataset's title:
In [2]:
from_daru = Plot.new(
  df,
  style_data: 'lines',
  yrange: 0..2200,
  xlabel: 'Number of test',
  ylabel: 'Time, s',
  title: 'Time spent to run deploy pipeline'
)
Out[2]:
[Plot: "Time spent to run deploy pipeline". x axis: Number of test (Run A to Run D); y axis: Time, s (0 to 2000); datasets: Build, Deploy, Test, Overall.]
In [3]:
from_daru.options(
  style_data: 'histograms',
  style_fill: 'pattern border'
)
Out[3]:
[Plot: "Time spent to run deploy pipeline". x axis: Number of test (Run A to Run D); y axis: Time, s (0 to 2000); datasets: Build, Deploy, Test, Overall.]
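To make the column-to-dataset mapping concrete, here is a minimal plain-Ruby sketch of the idea. It is not GnuplotRB's implementation: `columns_to_datasets` is a hypothetical helper, and the hash-and-array input stands in for a real Daru::DataFrame. Each column yields one dataset whose title is the column name and whose rows pair an index label (the future x tick) with a value.

```ruby
# Hypothetical helper, NOT part of GnuplotRB or daru: turn a hash of columns
# plus a shared index into one dataset description per column.
def columns_to_datasets(columns, index)
  columns.map do |name, values|
    {
      title: name.to_s,          # column name becomes the dataset title
      rows: index.zip(values)    # [index label, value] pairs, label = x tick
    }
  end
end

columns = {
  Build: [312, 630, 315, 312],
  Test: [525, 1050, 701, 514],
  Deploy: [215, 441, 370, 220]
}
index = ['Run A', 'Run B', 'Run C', 'Run D']

datasets = columns_to_datasets(columns, index)

# Print each dataset as a titled block of "label value" lines.
datasets.each do |ds|
  puts "# #{ds[:title]}"
  ds[:rows].each { |label, value| puts "\"#{label}\" #{value}" }
end
```

Every dataset shares the same index labels, which is why the plots above show all series over the same Run A to Run D ticks.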
A dataset may be initialized from either a Daru::Vector or a Daru::DataFrame, here wrapped in an array together with its plotting options:
In [4]:
Plot.new([df[:Overall], with: 'lines'])
Out[4]:
[Plot: x axis: Run A to Run D; y axis: 1000 to 2200; dataset: Overall.]
In [5]:
rows = (1..30).map do |i|
  [i**2 * (rand(4) + 3) / 5, rand(70)]
end
df = Daru::DataFrame.rows(rows, order: [:Value, :Error], name: 'Confidence interval')

random_points = Plot.new(
  [df[:Value], with: 'lines', title: 'Average value'],
  [df, with: 'err']
)
Out[5]:
[Plot: x axis: 0 to 30; y axis: -100 to 900; datasets: Average value, Confidence interval.]
Now let's try to fit it with a polynomial:
In [6]:
poly = fit_poly(df, degree: 5)
random_points.add_dataset(poly[:formula_ds])
Out[6]:
[Plot: x axis: 0 to 30; y axis: -100 to 900; datasets: Fit formula, Average value, Confidence interval.]
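For readers who want to see what fitting a polynomial to points means numerically, here is a small plain-Ruby sketch. It is not how fit_poly works internally, and it ignores the Error column. It swaps in ordinary least squares solved through the normal equations, and both `solve_linear` and `least_squares_poly` are hypothetical helper names. The approach is fine for small, low-degree illustrations but becomes numerically fragile at higher degrees.

```ruby
# Solve the linear system a * x = b by Gaussian elimination with partial pivoting.
def solve_linear(a, b)
  n = b.size
  # Augmented matrix [a | b], converted to floats.
  m = a.each_with_index.map { |row, i| row.map(&:to_f) + [b[i].to_f] }

  n.times do |col|
    # Pick the row with the largest pivot to limit rounding error.
    pivot = (col...n).max_by { |r| m[r][col].abs }
    m[col], m[pivot] = m[pivot], m[col]
    (col + 1...n).each do |r|
      factor = m[r][col] / m[col][col]
      (col..n).each { |c| m[r][c] -= factor * m[col][c] }
    end
  end

  # Back substitution on the upper-triangular system.
  x = Array.new(n, 0.0)
  (n - 1).downto(0) do |i|
    known = (i + 1...n).sum { |j| m[i][j] * x[j] }
    x[i] = (m[i][n] - known) / m[i][i]
  end
  x
end

# Hypothetical helper: least-squares polynomial coefficients [c0, c1, ..., c_degree]
# so that y is approximated by c0 + c1*x + ... + c_degree * x**degree.
def least_squares_poly(xs, ys, degree)
  size = degree + 1
  # Normal equations (V^T V) c = V^T y, where V[i][j] = xs[i]**j.
  a = Array.new(size) { |r| Array.new(size) { |c| xs.sum { |x| x**(r + c) } } }
  b = Array.new(size) { |r| xs.zip(ys).sum { |x, y| y * x**r } }
  solve_linear(a, b)
end

# Points that lie exactly on y = 2x^2 + 3x + 1, so the fit should recover
# coefficients close to [1, 3, 2].
xs = [0, 1, 2, 3, 4]
ys = xs.map { |x| 2 * x**2 + 3 * x + 1 }
coeffs = least_squares_poly(xs, ys, 2)
puts coeffs.map { |c| c.round(6) }.inspect
```

In the notebook, the fitted curve comes back as a ready-made dataset (`poly[:formula_ds]`), so none of this arithmetic has to be written by hand.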
In [7]:
df = Daru::DataFrame.new({
    a: Array.new(100) { |i| i },
    b: 100.times.map { rand }
  },
  name: 'Scatter example'
)
Plot.new([df, pt: 6, ps: 1, using: '2:3'], xrange: -10..110, yrange: -0.1..1.1)
Out[7]:
[Plot: x axis: 0 to 100; y axis: 0 to 1; dataset: Scatter example.]
In [8]:
frames = 100.times.map do |i|
  Plot.new([df.row[0..i], using: '2:3', pt: 6, ps: 1])
end
Animation.new(*frames, xrange: -10..110, yrange: -0.1..1.1)
Out[8]: