More database notes

Petra Lamborn 5 years ago
parent
commit
226b632cb4
1 changed files with 133 additions and 1 deletions
notes.md (+133, -1)

@@ -7,7 +7,8 @@ Accessed either via the SQL Manager program on the laptop, the `psql` terminal c
 Have created an experimental table called `public.coup_tall_april` containing data from April 2017 in a "tall" format, using code from Jason:
 
 ```sql
-CREATE TABLE public.coup_tall_April AS
+
+CREATE TABLE public.coup_tall_april AS
 SELECT  a.icp_id
       , a.read_date
       , c.period
@@ -21,12 +22,15 @@ WHERE   a.read_date >= to_date('01/04/2017','dd/mm/yyyy')
  and   a.content_code  ~ ('UN|CN|EG')
 GROUP BY 1, 2, 3
 ORDER BY 1, 2, 3;
+
 ```
 
 This data looks like:
 
 ```sql
+
 SELECT * FROM public.coup_tall_april limit 10;
+
 ```
 
  icp_id  | read_date  | period | kwh_tot | kwh_un | kwh_cn
@@ -48,3 +52,131 @@ SELECT * FROM public.coup_tall_april limit 10;
 * `kwh_cn` is the demand in kWh that the company has some level of control over, e.g. by systems that turn water heaters off and on remotely. This ought to be relatively stable, although in many cases it will be 0.
 * `kwh_un` is the uncontrolled demand, i.e. the rest.
 * `kwh_tot` is the sum of the other kWh measurements for this half-hour.
+
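The definition of `kwh_tot` above implies `kwh_tot = kwh_un + kwh_cn` for every row. A minimal sketch of a query that would flag rows breaking that identity, assuming all three columns are non-NULL and numeric; the 0.0005 tolerance is an assumption to absorb rounding, not something taken from the notes:

```sql
-- Rows where the total differs from the sum of its parts.
-- The 0.0005 tolerance is an assumed allowance for rounding.
-- Rows with a NULL in any of the three columns are not returned.
SELECT icp_id, read_date, period, kwh_tot, kwh_un, kwh_cn
FROM public.coup_tall_april
WHERE abs(kwh_tot - (kwh_un + kwh_cn)) > 0.0005
ORDER BY icp_id, read_date, period;
```

An empty result would support the description above; any rows returned would be worth checking against the source tables.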
+## Statistics
+
+This dataset includes 34278 distinct ICPs: `SELECT COUNT(DISTINCT icp_id) FROM public.coup_tall_april;`
+
+The number of ICPs recorded varies from day to day, between 34070 and 34199, and no single day includes all 34278:
+
+```sql
+
+SELECT read_date, COUNT(DISTINCT icp_id) AS d_icp
+FROM public.coup_tall_april
+GROUP BY read_date;
+
+```
+
+ read_date  | d_icp
+------------+-------
+ 2017-04-01 | 34080
+ 2017-04-02 | 34070
+ 2017-04-03 | 34082
+ 2017-04-04 | 34085
+ 2017-04-05 | 34083
+ 2017-04-06 | 34078
+ 2017-04-07 | 34084
+ 2017-04-08 | 34085
+ 2017-04-09 | 34079
+ 2017-04-10 | 34097
+ 2017-04-11 | 34102
+ 2017-04-12 | 34095
+ 2017-04-13 | 34127
+ 2017-04-14 | 34127
+ 2017-04-15 | 34128
+ 2017-04-16 | 34122
+ 2017-04-17 | 34119
+ 2017-04-18 | 34161
+ 2017-04-19 | 34178
+ 2017-04-20 | 34181
+ 2017-04-21 | 34190
+ 2017-04-22 | 34187
+ 2017-04-23 | 34178
+ 2017-04-24 | 34190
+ 2017-04-25 | 34180
+ 2017-04-26 | 34199
+ 2017-04-27 | 34193
+ 2017-04-28 | 34194
+ 2017-04-29 | 34179
+ 2017-04-30 | 34162
+
+
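The daily counts above are all below the 34278 distinct ICPs in the table, so some ICPs must be missing on some days. A sketch of one way to see how many days each ICP appears on; the aliases `per_icp`, `n_days` and `n_icp` are illustrative names, not from the notes:

```sql
-- For each ICP, count the distinct days with at least one reading,
-- then tally how many ICPs share each day count.
SELECT n_days, COUNT(*) AS n_icp
FROM (
    SELECT icp_id, COUNT(DISTINCT read_date) AS n_days
    FROM public.coup_tall_april
    GROUP BY icp_id
) AS per_icp
GROUP BY n_days
ORDER BY n_days DESC;
```

ICPs with `n_days` below 30 are the ones that come and go during April. Separately, the per-day query above has no `ORDER BY`, so PostgreSQL does not guarantee the date order shown; adding `ORDER BY read_date` would make it deterministic.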
+Daily averages are similar across the month, but the minimum is negative on two days:
+
+```sql
+
+SELECT read_date, min(kwh_tot), avg(kwh_tot), max(kwh_tot)
+FROM public.coup_tall_april
+GROUP BY read_date;
+
+```
+
+ read_date  |   min   |          avg           |        max
+------------+---------+------------------------+--------------------
+ 2017-04-01 |     0.0 | 0.41225447048611111114 |             30.928
+ 2017-04-02 |     0.0 | 0.42826891265531748365 |             28.153
+ 2017-04-03 |     0.0 | 0.43139900216145374882 |             28.041
+ 2017-04-04 |     0.0 | 0.44293095264290254757 |             31.111
+ 2017-04-05 |     0.0 | 0.44780382081976351847 | 29.100999999999999
+ 2017-04-06 |     0.0 | 0.43275886569047479314 |             28.067
+ 2017-04-07 |     0.0 | 0.42198233958748973126 |             37.413
+ 2017-04-08 |     0.0 | 0.42289754718595667695 |             25.908
+ 2017-04-09 |     0.0 | 0.43495351487230650355 |             30.373
+ 2017-04-10 |     0.0 | 0.42881511692133227754 |             26.791
+ 2017-04-11 |     0.0 | 0.42526183459425644637 |             35.234
+ 2017-04-12 | -30.530 | 0.44193485420638412283 |             29.818
+ 2017-04-13 |     0.0 | 0.44908520990222795247 |             31.721
+ 2017-04-14 |     0.0 | 0.43110074745314071950 |             27.167
+ 2017-04-15 |     0.0 | 0.41132879527074542899 |             30.746
+ 2017-04-16 |     0.0 | 0.41155711552175527034 |             26.713
+ 2017-04-17 |     0.0 | 0.42657600420586769836 |             27.751
+ 2017-04-18 |     0.0 | 0.43113055579949845344 |             34.414
+ 2017-04-19 |     0.0 | 0.43415263473579495584 |             26.547
+ 2017-04-20 |     0.0 | 0.43513854431799342716 |             27.124
+ 2017-04-21 |     0.0 | 0.43824147350589841087 |             30.365
+ 2017-04-22 |     0.0 | 0.42576349014245180919 |             31.112
+ 2017-04-23 |     0.0 | 0.44098844956307176160 |             31.099
+ 2017-04-24 |     0.0 | 0.43441511284976113876 |             27.109
+ 2017-04-25 |     0.0 | 0.43805378632728691245 |             25.776
+ 2017-04-26 |     0.0 | 0.44017528594890688813 |             26.907
+ 2017-04-27 |     0.0 | 0.44285652216828005733 |             29.544
+ 2017-04-28 |     0.0 | 0.43437889688249400482 |             29.598
+ 2017-04-29 | -31.624 | 0.45286084474384856201 |             31.874
+ 2017-04-30 |     0.0 | 0.46408289668832816191 |             31.960
+
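The `max` column above mixes values such as `29.100999999999999` with three-decimal ones, and `avg` carries twenty decimal places. A sketch of the same summary rounded to three decimals, assuming `kwh_tot` can be cast to `numeric`; the output aliases are illustrative:

```sql
-- Same per-day summary, rounded for readability and sorted by date.
-- The ::numeric casts are there because round(x, n) with a digit count
-- is defined for numeric, not for double precision.
SELECT read_date,
       round(min(kwh_tot)::numeric, 3) AS min_kwh,
       round(avg(kwh_tot)::numeric, 3) AS avg_kwh,
       round(max(kwh_tot)::numeric, 3) AS max_kwh
FROM public.coup_tall_april
GROUP BY read_date
ORDER BY read_date;
```

Rounding only affects display here; the stored values are untouched.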
+Three rows in this table contain negative values:
+
+```sql
+
+SELECT * FROM public.coup_tall_april WHERE kwh_tot < 0 OR kwh_un < 0 OR kwh_cn < 0;
+
+```
+
+ icp_id  | read_date  | period | kwh_tot | kwh_un  | kwh_cn
+---------+------------+--------+---------+---------+--------
+ I017181 | 2017-04-12 |     19 | -30.530 | -30.585 |  0.055
+ I019141 | 2017-04-29 |     37 | -31.445 | -31.445 |      0
+ I019141 | 2017-04-29 |     38 | -31.624 | -31.624 |      0
+
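To judge whether these negatives are isolated glitches or part of a longer run, it may help to view the full day for each affected ICP. A minimal sketch using the two `icp_id` and `read_date` pairs from the result above, assuming `read_date` compares cleanly against a date literal:

```sql
-- All half-hour periods on the affected days for the two ICPs
-- that returned negative readings.
SELECT icp_id, read_date, period, kwh_tot, kwh_un, kwh_cn
FROM public.coup_tall_april
WHERE (icp_id = 'I017181' AND read_date = DATE '2017-04-12')
   OR (icp_id = 'I019141' AND read_date = DATE '2017-04-29')
ORDER BY icp_id, period;
```

Comparing the periods just before and after the negative ones may show whether a matching positive spike offsets them.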
+There are 334 distinct ICPs in this table whose `icp_id` ends in 17:
+
+```sql
+
+SELECT COUNT (DISTINCT icp_id) FROM public.coup_tall_april WHERE icp_id LIKE '%17';
+
+SELECT DISTINCT icp_id FROM public.coup_tall_april WHERE icp_id LIKE '%17' ORDER BY icp_id LIMIT 10;
+
+```
+
+ icp_id
+---------
+ I000117
+ I000217
+ I000417
+ I000517
+ I000617
+ I000817
+ I001117
+ I001217
+ I001317
+ I001417
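
Going by the two counts recorded in these notes (334 of 34278), ICPs ending in 17 make up roughly 1% of the table. A sketch that computes both counts and the share in one pass, assuming a PostgreSQL version that supports the aggregate `FILTER` clause (9.4 or later); the column aliases are illustrative:

```sql
-- Count ICPs ending in 17, all ICPs, and the percentage share.
-- 100.0 forces numeric arithmetic so the division is not truncated.
SELECT COUNT(DISTINCT icp_id) FILTER (WHERE icp_id LIKE '%17') AS n_ending_17,
       COUNT(DISTINCT icp_id) AS n_total,
       round(100.0 * COUNT(DISTINCT icp_id) FILTER (WHERE icp_id LIKE '%17')
             / COUNT(DISTINCT icp_id), 2) AS pct_ending_17
FROM public.coup_tall_april;
```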