Creating Basic Graphs

⭐️Learning Goal(s)

In this tutorial, you will make a basic point or ‘dot’ plot (along with some optional customizations) of the fruit-diameter dataset below. You will become familiar with the following components:

To manage our data, we will use the DataContainer library. In addition, the concept of scaling will be introduced through d3 scales.

Throughout the tutorial, we will write our visualization is Svelte’s template syntax. If you are unfamiliar with Svelte, we suggest to take a look at the first sections of the general Svelte tutorial first.

Dataset

fruit diameter fruit diameter
lime 4.7 lime 4.7
lemon 6.1 grapefruit 8.8
grapefruit 7.9 pomelo 12.5
lemon 6.6 grapefruit 12.7
orange 6.7 grapefruit 8.6
lemon 5.3 pomelo 13.1
pomelo 11.6 lime 5.8
grapefruit 11.1 orange 8.9
lime 5.5 grapefruit 9.1
pomelo 10.6 pomelo 10.3
lemon 6.4 grapefruit 9.4

Loading the dataset

When working with florence, in most cases it is recommended to use the sidecar florence-datacontainer library, which exports a DataContainer class. This DataContainer allows for reading and manipulating data sets in way that is consistent with the dplyr approach from R’s tidyverse.

The two most common data formats that can be loaded to the DataContainer are column-oriented and row-oriented data.

Row-oriented data takes the form of an array of objects. For example, in row-oriented format, the dataset above would look something like this:

[
  { fruit: 'lime', diameter: 4.7 },
  { fruit: 'lemon', diameter: 6.1 }
  ...
]

Data in column-oriented format, on the other hand, takes the form of a single object of columns represented as Arrays, like so:

{
  fruit: ['lime', 'lemon', ...],
  diameter: [4.7, 6.1, ...]
}

Passing in data by column is the preferred format as it reduces the amount of pre-processing needed. The dataset above in column-oriented format looks as follows:

<script>
  const data = {
    diameter: [
      4.7, 6.1, 7.9, 6.6, 6.7, 5.3, 11.6, 11.1, 5.5, 10.6, 6.4,
      4.9, 8.8, 12.5, 12.7, 8.6, 13.1, 5.8, 8.9, 9.1, 10.3, 9.4, 10.1
    ],
  
    fruit: [
      'lime', 'lemon', 'grapefruit', 'lemon', 'orange', 'lemon', 
      'pomelo', 'grapefruit', 'lime', 'pomelo', 'lemon',
      'lime', 'grapefruit', 'pomelo', 'grapefruit', 'grapefruit',
      'pomelo', 'lime', 'orange', 'grapefruit', 'pomelo', 
      'grapefruit', 'anchovies'
    ]
  }
</script>

We will load this data into a DataContainer. Somehow some anchovies ended up in our fruit data! Let’s filter them out immediately. Filter is one of the data transformation methods made available by DataContainer.

<script>
  import DataContainer from '@snlab/florence-datacontainer'

  const data = {
    diameter: [
      4.7, 6.1, 7.9, 6.6, 6.7, 5.3, 11.6, 11.1, 5.5, 10.6, 6.4,
      4.9, 8.8, 12.5, 12.7, 8.6, 13.1, 5.8, 8.9, 9.1, 10.3, 9.4, 0.5
    ],
  
    fruit: [
      'lime', 'lemon', 'grapefruit', 'lemon', 'orange', 'lemon', 
      'pomelo', 'grapefruit', 'lime', 'pomelo', 'lemon',
      'lime', 'grapefruit', 'pomelo', 'grapefruit', 'grapefruit',
      'pomelo', 'lime', 'orange', 'grapefruit', 'pomelo', 
      'grapefruit', 'anchovies'
    ]
  }

  const dataContainer = new DataContainer(data)
    .filter(row => row.fruit !== 'anchovies')
</script>

For further details on using DataContainer, refer to its documentation.

Graphic

The Graphic component forms the basis of all Florence graphics, including ours. We will give our graphic a width and height of 500 pixels:

<script>
  import { Graphic } from '@snlab/florence'
  ...
</script>

<Graphic width={500} height={500}>

</Graphic>

Section

The Section component divides the Graphic (or canvas) up into subsections. Each subsection can contain its own graph with its own marks, axes etc. While this could also be accomplished by creating multiple Graphics, it is often more convenient to use multiple Sections inside a single Graphic, which allows you to use florences coordinate systems to position the various subsections of the graph (whereas with multiple Graphics you would have to do this with HTML and CSS). Another reason for using Sections is when you want to use multiple types of coordinate systems within a single graphic, which is the main reason for using a Section in this tutorial.

We will declare a single Section with a bit of padding like so:

<script>
  import { Graphic, Section } from '@snlab/florence'
  import DataContainer from '@snlab/florence-datacontainer'

  const data = {
    ...
  }

  const dataContainer = ...
</script>

<Graphic width={500} height={500}>

  <Section padding={40}>

  </Section>

</Graphic>

For further details on Section, refer to its documentation.

Scaling coordinates

In our example, let’s say we’d like to plot the column fruit to the x-axis, and diameter to the y-axis. To accomplish this, we will have to somehow convert our data values, which are fruit names and diameters, into x and y pixel coordinates that will be displayed on our screens. This is what scales are used for. We will create our scales in the <script> section of the component:

<script>
  import { Graphic, Section } from '@snlab/florence'
  import DataContainer from '@snlab/florence-datacontainer'
  import { scalePoint, scaleLinear } from 'd3-scale'

  const data = {
    ...
  }

  const dataContainer = ...

  const fruitDomain = dataContainer.domain('fruit')
  const scaleFruit = scalePoint().domain(fruitDomain).padding(0.2)
  const diameterDomain = [0, dataContainer.max('diameter')]
  const scaleDiameter = scaleLinear().domain(diameterDomain)
</script>

We then pass our scales to the scaleX and scaleY props of the Section:

<Graphic width={500} height={500}>

  <Section
    padding={40}
    scaleX={scaleFruit}
    scaleY={scaleDiameter}
  >

  </Section>

</Graphic>

For further information on d3-scales, refer to the documentation.

Marks

In order to actually render one point per row in the dataset, we need to specify our chosen mark. Since we are going to draw multiple points, it makes sense to use the PointLayer component. We could, however, also have chosen another type of Layer. Refer to the Marks section of the documentation for more options.

<script>
  import { Graphic, Section, PointLayer } from '@snlab/florence'
  ...
</script>

<Graphic width={500} height={500}>

  <Section
    padding={40}
    scaleX={scaleFruit}
    scaleY={scaleDiameter}
  >

    <PointLayer
      x={dataContainer.column('fruit')}
      y={dataContainer.column('diameter')}
    />

  </Section>

</Graphic>

If we were plotting only a single point, it would have made more sense to use the Point mark instead of the PointLayer.

Note that the PointLayer component requires the x and y properties to be specified. This tells the component which data variable to use as the x and y coordinates of each point. Since we want our fruits on the x-axis and our diameters on the y-axis, we specify x and y as data.column('fruit') and data.column('diameter'), respectively.

Since we have already specified the scale values in Section, our data values will be be scaled to screen coordinates.

The graph should now look like this:

Axes

The points look fine, but they lack context. In order to understand what the points are supposed to represent, we will add some axes. Axes are added using the XAxis and YAxis components. These components will automatically adapt to the scales specified by their parent Section component, and revert to smart defaults in their absence.

<script>
  import { Graphic, Section, PointLayer, XAxis, YAxis } from '@snlab/florence'
  ...
</script>

<Graphic width={500} height={500}>

  <Section
    padding={40}
    scaleX={scaleFruit}
    scaleY={scaleDiameter}
  >

    <PointLayer
      x={data.column('fruit')}
      y={data.column('diameter')}
    />

    <XAxis title={'fruit'} />
    <YAxis title={'diameter (cm)'} />

  </Section>

</Graphic>

The graphic should now look like this:

limelemongrapefruitorangepomelo fruit 012345678910111213 diameter (cm)

Whoops! The graphic looks like it’s upside down, because in your browser, y coordinates go from up to down instead of down to up. Better flip the y dimension of our Section’s coordinate system:

<Section
  padding={40}
  scaleX={scaleFruit}
  scaleY={scaleDiameter}
  flipY
>
limelemongrapefruitorangepomelo fruit 012345678910111213 diameter (cm)

Title

Graphics usually come with titles describing their content. Titles can be added using the Label component. Note that we place the Label outside of the Section, so that we can use the Graphic’s coordinate system (which is in ‘pixel space’ rather than ‘data space’) to position it. This is just to show the two different coordinate systems that currently exist within out graphic- we could also have used the coordinate system inside of the Section.

<script>
  import { Graphic, Section, PointLayer, XAxis, YAxis, Label } from '@snlab/florence'
  ...
</script>

<Graphic width={500} height={500}>

  <Section
    padding={40}
    scaleX={scaleFruit}
    scaleY={scaleDiameter}
    flip
  >

    ...

  </Section>

  <Label
    x={0.5}
    y={0.14}
    text={'Fruit Sizes'}
    fontFamily={'Baskerville'}
    fontSize={18}
  />

</Graphic>
limelemongrapefruitorangepomelo fruit 012345678910111213 diameter (cm) Fruit Sizes

Scaling other aesthetics

So far, we have used scales to convert raw values in the dataset (e.g. fruit names) into values that can be rendered on a screen (pixel coordinates). In technical terms, we could say that in the examples above, we used scales to map values in the domain of fruit names to values in the range of pixel coordinates in the x dimension, and values in the domain of fruit diameters to values in the range of pixel coordinates in the y dimension. Whew!

However, scaling is not limited to changing data values (domain) into screen coordinates/pixels (range). Scaling can be done for a wide range of aesthetics, such as color, radius and opacity. For example, when we use a scale to give each type of fruit its own color, we are mapping values in the domain of fruit names to a range of colors. In this tutorial, we will use a scale to give each type of fruit a different color, and another scale to make fruits with higher diameters look bigger. We will use the fill and radius props on the PointLayer for this.

<script>
  ...
  import { scalePoint, scaleLinear, scaleOrdinal } from 'd3-scale'
  import { schemeCategory10 } from 'd3-scale-chromatic'
  ...

  const scaleFruitColor = scaleOrdinal()
    .domain(fruitDomain)
    .range(schemeCategory10)

  const scaleRadius = scaleLinear()
    .domain(diameterDomain)
    .range([2, 10])
</script>

<Graphic width={500} height={500}>

  <Section
    padding={40}
    scaleX={scaleFruit}
    scaleY={scaleDiameter}
  >

    <PointLayer
      x={dataContainer.column('fruit')}
      y={dataContainer.column('diameter')}
      fill={dataContainer.map('fruit', scaleFruitColor)}
      radius={dataContainer.map('diameter', scaleRadius)}
    />

    ...

  </Section>

  ...

</Graphic>
limelemongrapefruitorangepomelo fruit 012345678910111213 diameter (cm) Fruit Sizes

Note that the scaleFruitColor and scaleRadius not only have a .domain, but also a .range. What ‘domains’ and ‘ranges’ are was explained under Scaling coordinates. So why do we now suddenly have to specify both the domain and the range, while for the scaleFruit and scaleDiameter we only had to specify the domain? The answer is that florence takes care of the range when you pass your scales to scaleX and scaleY. But for scales that do not relate to position, we have to define a range ourselves.

For scaleFruitColor we are using an ‘ordinal’ scale, which means that our domains and ranges are both categories (fruit names and colors). What about the schemeCategory10? This simply tells the scale to use the ‘category 10’ color scheme. This is one of many predefined color schemes from d3. However, do note that certain color schemes can only be used with categorical variables/certain scale types (like we are doing here), while others can only be used with numerical variables.

Finally, what is going on with the .map('fruit', scaleFruitColor) part, you might ask? This simply means “use scaleFruitColor to convert every fruit in the fruit column to a color in schemeCategory10”. Likewise, .map('diameter', scaleRadius) means “convert every fruit diameter into a radius expressed in pixel values”.

Result