@@ -24,7 +24,6 @@ I've included a bunch of annotated SQL queries in `sql/queries.pgsql` and `sql/w
`requirements.txt` lists the Python packages required to set up a virtual environment with `virtualenv -p /usr/bin/python3 venv` and `pip install -r requirements.txt`. Notably these are:
-
* numpy
* pandas
* pkg-resources
@@ -35,3 +34,53 @@ I've included a bunch of annotated SQL queries in `sql/queries.pgsql` and `sql/w
* scipy
* seaborn
* statsmodels
+
+Activate the virtual environment with `source venv/bin/activate`. The Python scripts are in the `py/` folder. Scripts that are designed to be called directly are run with `python <scriptname.py>`; use `python <scriptname.py> -h` to view help. Most options have a default, which may not be what you want, so always check.
+
+### `util.py`
+
+This script is imported by several other scripts, particularly for downloading the data from the database.
+
+### `downkwh.py`
+
+Downloads demand data from the database. Options:
+
+* `-o PATH`: The path of the Python "pickle" file to store the result in.
+* `-s DATE`: The start date for the download in `YYYY-MM-DD` format; default of 2017-01-01.
+* `-e DATE`: The end date in `YYYY-MM-DD` format; default of 2018-01-01.
+* `-t TABLE`: The table in the database from which to obtain the wanted ICP ids; default is `public.icp_sample`, a table which contains 1000 ICPs with good data for 2017. **Important**: I have constrained the values that this script will accept from the command line to the following list, but don't assume that SQL injection can't come through this vector:
+    * `public.best_icp`: all ICPs with at least 360 days of data in 2017
+    * `public.best_icp_1618`: all ICPs with at least 720 days of data in the 2 years from 1 April 2016
+    * `public.best_icp_18m`: all ICPs with at least 540 days of data from July 2016 to the end of 2017
+    * `public.icp_sample`: a pre-generated 1k sample from `best_icp`
+    * `public.icp_sample_5k`: a pre-generated 5k sample from `best_icp`
+    * `public.icp_sample_1618`: a pre-generated 1k sample from `best_icp_1618`
+    * `public.icp_sample_18m`: a pre-generated 1k sample from `best_icp_18m`
+* `-n NUM`: The script downloads the dataset in pieces, optimises them to reduce storage space, and reassembles them. This option sets the number of pieces; it should always be less than the number of days between the start and end dates. Default of 12.
+* `--no-pivot`: This option can probably be ignored; it downloads the dataset in a less efficient non-"pivoted" form, which was used in the original versions of some of these scripts.
+* `-v`: Output extra progress information as it goes; mostly useful for debugging.
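One way to constrain `-t` to the list above is to let `argparse` reject any other value before it can reach a query. This is only a sketch of that idea, not the code in `downkwh.py`: the names `ALLOWED_TABLES`, `build_parser`, and `parse_table` are hypothetical, and only the `-t` option is shown.

```python
import argparse

# Table names copied from the option list above; anything else is rejected.
ALLOWED_TABLES = [
    "public.best_icp",
    "public.best_icp_1618",
    "public.best_icp_18m",
    "public.icp_sample",
    "public.icp_sample_5k",
    "public.icp_sample_1618",
    "public.icp_sample_18m",
]


def build_parser():
    """Hypothetical parser that sketches only the -t option."""
    parser = argparse.ArgumentParser(description="Allow-list sketch for -t")
    parser.add_argument(
        "-t",
        dest="table",
        metavar="TABLE",
        choices=ALLOWED_TABLES,       # argparse exits with an error otherwise
        default="public.icp_sample",  # documented default
        help="table holding the wanted ICP ids",
    )
    return parser


def parse_table(argv):
    """Return the validated table name for the given argument list."""
    return build_parser().parse_args(argv).table
```

Because the accepted value is always one of a fixed set of literals, the table name never carries user-controlled text into the SQL. Any other user-supplied values (dates, for instance) should still go through the database driver's query parameters rather than string formatting.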
+
+Example:
+
+```bash
+python downkwh.py -o ../data/test1k.pkl -n 24
+```
+
+Downloads data for the default period into `../data/test1k.pkl`, using 24 segments.
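To illustrate what the `-n` option controls, here is a minimal sketch of splitting a download period into contiguous pieces. It is an assumption about the general approach, not the logic used by `downkwh.py`: the function name `split_period` is hypothetical, and treating each piece as a half-open range (start included, end excluded) is a choice made for this example.

```python
from datetime import date, timedelta


def split_period(start, end, pieces):
    """Split [start, end) into `pieces` contiguous half-open date ranges."""
    total_days = (end - start).days
    # Mirrors the documented rule: fewer pieces than days in the period.
    if not 0 < pieces < total_days:
        raise ValueError("pieces must be positive and less than the number of days")
    # Integer arithmetic spreads any remainder days evenly across the pieces.
    bounds = [
        start + timedelta(days=(total_days * i) // pieces)
        for i in range(pieces + 1)
    ]
    return list(zip(bounds[:-1], bounds[1:]))


if __name__ == "__main__":
    # Default period and piece count from the option list above.
    for seg_start, seg_end in split_period(date(2017, 1, 1), date(2018, 1, 1), 12):
        print(seg_start, "to", seg_end)
```

Each piece could then be downloaded and optimised on its own before the results are concatenated, which keeps the peak memory use bounded by the size of one piece.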
+
+### `downweather.py`
+
+Downloads weather (temperature and humidity) data from the database for one specified station. Options:
+
+* `-o PATH`: The path of the Python "pickle" file to store the result in.
+* `-s DATE`: The start date for the download in `YYYY-MM-DD` format; default of 2016-04-01.
+* `-e DATE`: The end date in `YYYY-MM-DD` format; default of 2019-01-01.
+* `--station`: The station to download from; default is 2006, which is located near Pukekohe.
+* `-v`: Output extra progress information as it goes; mostly useful for debugging.
+
+Example:
+
+```bash
+python downweather.py -o ../data/weathertest.pkl
+```
+Downloads data for the default period into `../data/weathertest.pkl`.
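The options above map naturally onto an `argparse` parser. The sketch below shows one way the documented flags and defaults could be declared. It is not taken from `downweather.py`: the names `iso_date` and `parse_weather_args`, the `dest` names, making `-o` required, treating the station id as an integer, and the start-before-end check are all assumptions made for illustration.

```python
import argparse
from datetime import date, datetime


def iso_date(text):
    """Parse a YYYY-MM-DD string into a date, with a readable argparse error."""
    try:
        return datetime.strptime(text, "%Y-%m-%d").date()
    except ValueError:
        raise argparse.ArgumentTypeError(f"not a YYYY-MM-DD date: {text!r}")


def parse_weather_args(argv):
    """Hypothetical parser mirroring the documented downweather.py options."""
    parser = argparse.ArgumentParser(description="downweather.py option sketch")
    # Assumption: the output path is mandatory.
    parser.add_argument("-o", dest="output", metavar="PATH", required=True)
    parser.add_argument("-s", dest="start", metavar="DATE", type=iso_date,
                        default=date(2016, 4, 1))
    parser.add_argument("-e", dest="end", metavar="DATE", type=iso_date,
                        default=date(2019, 1, 1))
    # Assumption: station ids are integers.
    parser.add_argument("--station", type=int, default=2006)
    parser.add_argument("-v", dest="verbose", action="store_true")
    args = parser.parse_args(argv)
    if args.start >= args.end:
        parser.error("start date must be before end date")
    return args
```

Parsing the dates at the command-line boundary means malformed input fails immediately with a usage message instead of partway through a download.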