@@ -180,3 +180,25 @@ python agg.py -i ../data/test1kb.pkl -c ../data/test1kbclustertable.pkl -o ../da
```

Downloads a new dataset from the `public.icp_sample_18m` sample and saves it to `../data/test1kb.pkl`. It then assigns clusters to this dataset (excluding the misc/'-1' cluster) from the `../data/test1kagg.pkl` dataset with a threshold of 0.1, saving the result to `../data/test1kbclustertable.pkl`. Finally, it aggregates the dataset and saves the result to `../data/test1kbagg.pkl`.
+
+## R
+
+The scripts in `R/` include visualisers for the data and for the creation of some models.
+
+### Installing R
+
+To install R on an EC2 instance, run the following:
+
+```bash
+# Install R, its development headers, and the libcurl headers
+sudo yum install R
+sudo yum install R-devel
+sudo yum install libcurl-devel
+# Create unversioned symlinks for the libgfortran and libquadmath shared libraries
+sudo ln -s /usr/lib64/libgfortran.so.3 /usr/lib64/libgfortran.so
+sudo ln -s /usr/lib64/libquadmath.so.0 /usr/lib64/libquadmath.so
+```
+
+To install the packages these scripts need, run the following from within `R` itself:
+
+```R
+install.packages(c("dplyr", "tidyr", "ggplot2", "forecast", "TSA", "reticulate", "caTools", "scales", "optparse"))
+```
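One way to confirm that the install steps worked is to ask R which of the listed packages are still missing. The sketch below is not part of the repository: the helper name `missing_pkgs_expr` and the `PKGS` variable are hypothetical, and it assumes `Rscript` is on the `PATH` once R is installed. It builds a one-line R expression from the package names in the `install.packages()` call and runs it only if `Rscript` is available.

```bash
#!/usr/bin/env bash
# Packages copied from the install.packages() call in the README.
PKGS="dplyr tidyr ggplot2 forecast TSA reticulate caTools scales optparse"

# Hypothetical helper: build an R expression that prints, one per line,
# the names of the given packages that are not currently installed.
missing_pkgs_expr() {
  local quoted=""
  for p in "$@"; do
    # Append each name as a quoted, comma-separated R string.
    quoted="${quoted:+$quoted, }\"$p\""
  done
  printf 'cat(setdiff(c(%s), rownames(installed.packages())), sep = "\\n")' "$quoted"
}

if command -v Rscript >/dev/null 2>&1; then
  # PKGS is left unquoted on purpose so it splits into one argument per package.
  Rscript -e "$(missing_pkgs_expr $PKGS)"
else
  echo "Rscript not found; install R first" >&2
fi
```

If the command prints nothing, every listed package is already installed; any names it prints can be passed to `install.packages()` again.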