For loop
Using loops in R is generally not considered good practice. The apply
family of functions, such as tapply, lapply, and so on, is the
recommended way to go, but for me it is too complicated in some
cases.
The apply family of functions can improve the speed of R code and the management of memory. Thus, apply functions are worth using when you have a big data set to play with.
Instead of using apply functions, I used a for loop. It is simpler for me to understand, and with a moderate data set (a few thousand rows and about 100 columns) the gain from apply functions is limited.
Written like this, a for loop can be reread quickly, even when you come back to your code years later.
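To make the comparison concrete, here is a minimal sketch of the same computation written both ways. The data frame `toy` and its values are made up for illustration and are not part of the tree-ring data.

```r
# Toy data set (made-up values, not the tree-ring data).
toy <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6), z = c(7, 8, 9))

# for loop: pre-allocate the result, then fill it one column at a time.
means_loop <- numeric(ncol(toy))
names(means_loop) <- names(toy)
for (j in seq_along(toy)) {
  means_loop[j] <- mean(toy[[j]])
}

# apply family: the same computation in a single call.
means_apply <- sapply(toy, mean)

# Both approaches give the same named numeric vector.
print(means_loop)
print(all.equal(means_loop, means_apply))
```

The loop is longer, but every step is spelled out, which is the readability argument made above.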
I will now give a small example to show when you can use a for loop and how to use it. Also, note that understanding the for loop leads naturally to the foreach function, which allows parallel computing. It will be explained in another post.
So, let's begin the for loop example.
I will show you how to make a spaghetti plot from tree-ring data.
About the data
Every year, trees make rings. We can count them to get the tree's age, and also measure them to learn a little more about the tree and its history. Indeed, tree increment (the ring width from one year to the next) gives information on past climate and past disturbances (pests, drought, fire, ice and wind storms, etc.).
In this data set (tree rings increment) you will find tree-ring increments [in mm] from 19 different trees (V1 to V19). An NA value marks a year without increment information for that tree (it was not born yet); row #270 is AD 2011 and row #1 is AD 1741.
The data look like this (first three trees only, printed with data.table 1.10.0):

         V1     V2    V3
  1:     NA     NA    NA
  2:     NA     NA    NA
  3:     NA     NA    NA
  4:     NA     NA    NA
  5:     NA     NA    NA
 ---
266: 128.99 182.54 66.89
267: 130.45 182.80 67.14
268: 131.54 183.24 67.50
269: 132.25 183.48 67.73
270: 132.76 184.34 68.14
We are now going to plot each tree starting from its first year of growth record (i.e. the closest year to row #1) and continuing until AD 2011 (i.e. row #270). Because the data set contains no year with an increment of 0, we need to add one for each tree. For one tree (V1), the result looks like this:
         V1
  1:   0.00
  2:   0.71
  3:   1.64
  4:   2.07
  5:   2.46
 ---
137: 128.99
138: 130.45
139: 131.54
140: 132.25
141: 132.76
Here we can see that all NA values have been removed and that a 0.00 value has been put in for the first year of growth of the tree.
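The step described above, for a single tree, can be sketched as follows. This is a minimal sketch and not necessarily the code used for the post: the vector `v1` is a short made-up stand-in for one column of the data set, and `trim_series` is a hypothetical helper name.

```r
# Hypothetical stand-in for one column of the data set: leading NAs
# (the tree was not born yet), followed by the measured values in mm.
v1 <- c(NA, NA, NA, 0.71, 1.64, 2.07, 2.46)

# Keep the values from the first record onward and put a 0 in front,
# standing for the first year of growth.
trim_series <- function(x) {
  first <- which(!is.na(x))[1]             # position of the first record
  if (is.na(first)) return(numeric(0))     # column with no record at all
  c(0, x[first:length(x)])
}

tree <- trim_series(v1)
print(tree)
```

Prepending the 0 has the same effect as replacing the last leading NA with 0 and then dropping the remaining NAs. It assumes the NAs only occur before the first record, as in the data shown above.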
Because we need to do this for each of the 19 trees, it is easier to do it within a for loop.
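A minimal sketch of such a loop, under stated assumptions: `rings` is a small made-up data frame with the same layout as the real data set (one column per tree, leading NAs), and `trim_series` and `plot_spaghetti` are hypothetical helper names, not functions from the original code. The author's full code is in the GitHub repository mentioned below.

```r
# Hypothetical toy data set with the same layout as the real one:
# one column per tree, NAs before the first record of each tree.
rings <- data.frame(
  V1 = c(NA, NA, 0.5, 1.2, 2.0, 2.9),
  V2 = c(NA, 0.8, 1.9, 3.1, 4.0, 5.2),
  V3 = c(NA, NA, NA, 0.4, 0.9, 1.5)
)

# Same idea as for a single tree: a 0 for the first year of growth,
# then the recorded values.
trim_series <- function(x) {
  first <- which(!is.na(x))[1]
  if (is.na(first)) return(numeric(0))
  c(0, x[first:length(x)])
}

# The trimmed series have different lengths, so store them in a list.
series <- vector("list", ncol(rings))
names(series) <- names(rings)
for (tree in names(rings)) {
  series[[tree]] <- trim_series(rings[[tree]])
}

# Spaghetti plot: one line per tree, each ending at the last row.
plot_spaghetti <- function(series, n_rows) {
  plot(1, type = "n", xlim = c(0, n_rows),
       ylim = c(0, max(unlist(series))),
       xlab = "Row of the data set (one row per year)",
       ylab = "Increment (mm)")
  for (i in seq_along(series)) {
    s <- series[[i]]
    x <- seq(to = n_rows, length.out = length(s))  # align on the last row
    lines(x, s, col = i)
  }
}
```

Calling `plot_spaghetti(series, nrow(rings))` then draws the plot. The `[[` column access used in the loop works the same way on a data.table as on a plain data frame.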
If you want, I have put all the code on my GitHub, together with the data set: Dendro-spaghetti-plot