1 |
From d4f872998373d064acf6a639efccc57922d28495 Mon Sep 17 00:00:00 2001 |
2 |
From: Paolo Valente <paolo.valente@linaro.org> |
3 |
Date: Fri, 28 Jul 2017 21:09:51 +0200 |
4 |
Subject: [PATCH 3/4] block, bfq: consider also in_service_entity to state |
5 |
whether an entity is active |
6 |
|
7 |
Groups of BFQ queues are represented by generic entities in BFQ. When |
8 |
a queue belonging to a parent entity is deactivated, the parent entity |
9 |
may need to be deactivated too, in case the deactivated queue was the |
10 |
only active queue for the parent entity. This deactivation may need to |
11 |
be propagated upwards if the entity belongs, in its turn, to a further |
12 |
higher-level entity, and so on. In particular, the upward propagation |
13 |
of deactivation stops at the first parent entity that remains active |
14 |
even if one of its child entities has been deactivated. |
15 |
|
16 |
To decide whether the last non-deactivation condition holds for a |
17 |
parent entity, BFQ checks whether the field next_in_service is still |
18 |
not NULL for the parent entity, after the deactivation of one of its |
19 |
child entity. If it is not NULL, then there are certainly other active |
20 |
entities in the parent entity, and deactivations can stop. |
21 |
|
22 |
Unfortunately, this check misses a corner case: if in_service_entity |
23 |
is not NULL, then next_in_service may happen to be NULL, although the |
24 |
parent entity is evidently active. This happens if: 1) the entity |
25 |
pointed by in_service_entity is the only active entity in the parent |
26 |
entity, and 2) according to the definition of next_in_service, the |
27 |
in_service_entity cannot be considered as next_in_service. See the |
28 |
comments on the definition of next_in_service for details on this |
29 |
second point. |
30 |
|
31 |
Hitting the above corner case causes crashes. |
32 |
|
33 |
To address this issue, this commit: |
34 |
1) Extends the above check on only next_in_service to controlling both |
35 |
next_in_service and in_service_entity (if any of them is not NULL, |
36 |
then no further deactivation is performed) |
37 |
2) Improves the (important) comments on how next_in_service is defined |
38 |
and updated; in particular it fixes a few rather obscure paragraphs |
39 |
|
40 |
Reported-by: Eric Wheeler <bfq-sched@lists.ewheeler.net> |
41 |
Reported-by: Rick Yiu <rick_yiu@htc.com> |
42 |
Reported-by: Tom X Nguyen <tom81094@gmail.com> |
43 |
Signed-off-by: Paolo Valente <paolo.valente@linaro.org> |
44 |
Tested-by: Eric Wheeler <bfq-sched@lists.ewheeler.net> |
45 |
Tested-by: Rick Yiu <rick_yiu@htc.com> |
46 |
Tested-by: Laurentiu Nicola <lnicola@dend.ro> |
47 |
Tested-by: Tom X Nguyen <tom81094@gmail.com> |
48 |
--- |
49 |
block/bfq-sched.c | 140 ++++++++++++++++++++++++++++++------------------------ |
50 |
block/bfq.h | 23 +++++++-- |
51 |
2 files changed, 95 insertions(+), 68 deletions(-) |
52 |
|
53 |
diff --git a/block/bfq-sched.c b/block/bfq-sched.c |
54 |
index fdf1c713d050..be985d9d5f17 100644 |
55 |
--- a/block/bfq-sched.c |
56 |
+++ b/block/bfq-sched.c |
57 |
@@ -196,21 +196,23 @@ static bool bfq_update_parent_budget(struct bfq_entity *next_in_service) |
58 |
|
59 |
/* |
60 |
* This function tells whether entity stops being a candidate for next |
61 |
- * service, according to the following logic. |
62 |
+ * service, according to the restrictive definition of the field |
63 |
+ * next_in_service. In particular, this function is invoked for an |
64 |
+ * entity that is about to be set in service. |
65 |
* |
66 |
- * This function is invoked for an entity that is about to be set in |
67 |
- * service. If such an entity is a queue, then the entity is no longer |
68 |
- * a candidate for next service (i.e, a candidate entity to serve |
69 |
- * after the in-service entity is expired). The function then returns |
70 |
- * true. |
71 |
+ * If entity is a queue, then the entity is no longer a candidate for |
72 |
+ * next service according to the that definition, because entity is |
73 |
+ * about to become the in-service queue. This function then returns |
74 |
+ * true if entity is a queue. |
75 |
* |
76 |
- * In contrast, the entity could stil be a candidate for next service |
77 |
- * if it is not a queue, and has more than one child. In fact, even if |
78 |
- * one of its children is about to be set in service, other children |
79 |
- * may still be the next to serve. As a consequence, a non-queue |
80 |
- * entity is not a candidate for next-service only if it has only one |
81 |
- * child. And only if this condition holds, then the function returns |
82 |
- * true for a non-queue entity. |
83 |
+ * In contrast, entity could still be a candidate for next service if |
84 |
+ * it is not a queue, and has more than one active child. In fact, |
85 |
+ * even if one of its children is about to be set in service, other |
86 |
+ * active children may still be the next to serve, for the parent |
87 |
+ * entity, even according to the above definition. As a consequence, a |
88 |
+ * non-queue entity is not a candidate for next-service only if it has |
89 |
+ * only one active child. And only if this condition holds, then this |
90 |
+ * function returns true for a non-queue entity. |
91 |
*/ |
92 |
static bool bfq_no_longer_next_in_service(struct bfq_entity *entity) |
93 |
{ |
94 |
@@ -223,6 +225,18 @@ static bool bfq_no_longer_next_in_service(struct bfq_entity *entity) |
95 |
|
96 |
BUG_ON(bfqg == ((struct bfq_data *)(bfqg->bfqd))->root_group); |
97 |
BUG_ON(bfqg->active_entities == 0); |
98 |
+ /* |
99 |
+ * The field active_entities does not always contain the |
100 |
+ * actual number of active children entities: it happens to |
101 |
+ * not account for the in-service entity in case the latter is |
102 |
+ * removed from its active tree (which may get done after |
103 |
+ * invoking the function bfq_no_longer_next_in_service in |
104 |
+ * bfq_get_next_queue). Fortunately, here, i.e., while |
105 |
+ * bfq_no_longer_next_in_service is not yet completed in |
106 |
+ * bfq_get_next_queue, bfq_active_extract has not yet been |
107 |
+ * invoked, and thus active_entities still coincides with the |
108 |
+ * actual number of active entities. |
109 |
+ */ |
110 |
if (bfqg->active_entities == 1) |
111 |
return true; |
112 |
|
113 |
@@ -1089,7 +1103,7 @@ static void bfq_update_fin_time_enqueue(struct bfq_entity *entity, |
114 |
* one of its children receives a new request. |
115 |
* |
116 |
* Basically, this function updates the timestamps of entity and |
117 |
- * inserts entity into its active tree, ater possible extracting it |
118 |
+ * inserts entity into its active tree, ater possibly extracting it |
119 |
* from its idle tree. |
120 |
*/ |
121 |
static void __bfq_activate_entity(struct bfq_entity *entity, |
122 |
@@ -1213,7 +1227,7 @@ static void __bfq_requeue_entity(struct bfq_entity *entity) |
123 |
BUG_ON(entity->tree && entity->tree != &st->active); |
124 |
/* |
125 |
* In addition, if the entity had more than one child |
126 |
- * when set in service, then was not extracted from |
127 |
+ * when set in service, then it was not extracted from |
128 |
* the active tree. This implies that the position of |
129 |
* the entity in the active tree may need to be |
130 |
* changed now, because we have just updated the start |
131 |
@@ -1221,9 +1235,8 @@ static void __bfq_requeue_entity(struct bfq_entity *entity) |
132 |
* time in a moment (the requeueing is then, more |
133 |
* precisely, a repositioning in this case). To |
134 |
* implement this repositioning, we: 1) dequeue the |
135 |
- * entity here, 2) update the finish time and |
136 |
- * requeue the entity according to the new |
137 |
- * timestamps below. |
138 |
+ * entity here, 2) update the finish time and requeue |
139 |
+ * the entity according to the new timestamps below. |
140 |
*/ |
141 |
if (entity->tree) |
142 |
bfq_active_extract(st, entity); |
143 |
@@ -1270,9 +1283,9 @@ static void __bfq_activate_requeue_entity(struct bfq_entity *entity, |
144 |
|
145 |
|
146 |
/** |
147 |
- * bfq_activate_entity - activate or requeue an entity representing a bfq_queue, |
148 |
- * and activate, requeue or reposition all ancestors |
149 |
- * for which such an update becomes necessary. |
150 |
+ * bfq_activate_requeue_entity - activate or requeue an entity representing a bfq_queue, |
151 |
+ * and activate, requeue or reposition all ancestors |
152 |
+ * for which such an update becomes necessary. |
153 |
* @entity: the entity to activate. |
154 |
* @non_blocking_wait_rq: true if this entity was waiting for a request |
155 |
* @requeue: true if this is a requeue, which implies that bfqq is |
156 |
@@ -1308,9 +1321,9 @@ static void bfq_activate_requeue_entity(struct bfq_entity *entity, |
157 |
* @ins_into_idle_tree: if false, the entity will not be put into the |
158 |
* idle tree. |
159 |
* |
160 |
- * Deactivates an entity, independently from its previous state. Must |
161 |
+ * Deactivates an entity, independently of its previous state. Must |
162 |
* be invoked only if entity is on a service tree. Extracts the entity |
163 |
- * from that tree, and if necessary and allowed, puts it on the idle |
164 |
+ * from that tree, and if necessary and allowed, puts it into the idle |
165 |
* tree. |
166 |
*/ |
167 |
static bool __bfq_deactivate_entity(struct bfq_entity *entity, |
168 |
@@ -1359,7 +1372,7 @@ static bool __bfq_deactivate_entity(struct bfq_entity *entity, |
169 |
/** |
170 |
* bfq_deactivate_entity - deactivate an entity representing a bfq_queue. |
171 |
* @entity: the entity to deactivate. |
172 |
- * @ins_into_idle_tree: true if the entity can be put on the idle tree |
173 |
+ * @ins_into_idle_tree: true if the entity can be put into the idle tree |
174 |
*/ |
175 |
static void bfq_deactivate_entity(struct bfq_entity *entity, |
176 |
bool ins_into_idle_tree, |
177 |
@@ -1406,16 +1419,29 @@ static void bfq_deactivate_entity(struct bfq_entity *entity, |
178 |
*/ |
179 |
bfq_update_next_in_service(sd, NULL); |
180 |
|
181 |
- if (sd->next_in_service) { |
182 |
+ if (sd->next_in_service || sd->in_service_entity) { |
183 |
/* |
184 |
- * The parent entity is still backlogged, |
185 |
- * because next_in_service is not NULL. So, no |
186 |
- * further upwards deactivation must be |
187 |
- * performed. Yet, next_in_service has |
188 |
- * changed. Then the schedule does need to be |
189 |
- * updated upwards. |
190 |
+ * The parent entity is still active, because |
191 |
+ * either next_in_service or in_service_entity |
192 |
+ * is not NULL. So, no further upwards |
193 |
+ * deactivation must be performed. Yet, |
194 |
+ * next_in_service has changed. Then the |
195 |
+ * schedule does need to be updated upwards. |
196 |
+ * |
197 |
+ * NOTE If in_service_entity is not NULL, then |
198 |
+ * next_in_service may happen to be NULL, |
199 |
+ * although the parent entity is evidently |
200 |
+ * active. This happens if 1) the entity |
201 |
+ * pointed by in_service_entity is the only |
202 |
+ * active entity in the parent entity, and 2) |
203 |
+ * according to the definition of |
204 |
+ * next_in_service, the in_service_entity |
205 |
+ * cannot be considered as |
206 |
+ * next_in_service. See the comments on the |
207 |
+ * definition of next_in_service for details. |
208 |
*/ |
209 |
BUG_ON(sd->next_in_service == entity); |
210 |
+ BUG_ON(sd->in_service_entity == entity); |
211 |
break; |
212 |
} |
213 |
|
214 |
@@ -1806,45 +1832,33 @@ static struct bfq_queue *bfq_get_next_queue(struct bfq_data *bfqd) |
215 |
|
216 |
/* |
217 |
* If entity is no longer a candidate for next |
218 |
- * service, then we extract it from its active tree, |
219 |
- * for the following reason. To further boost the |
220 |
- * throughput in some special case, BFQ needs to know |
221 |
- * which is the next candidate entity to serve, while |
222 |
- * there is already an entity in service. In this |
223 |
- * respect, to make it easy to compute/update the next |
224 |
- * candidate entity to serve after the current |
225 |
- * candidate has been set in service, there is a case |
226 |
- * where it is necessary to extract the current |
227 |
- * candidate from its service tree. Such a case is |
228 |
- * when the entity just set in service cannot be also |
229 |
- * a candidate for next service. Details about when |
230 |
- * this conditions holds are reported in the comments |
231 |
- * on the function bfq_no_longer_next_in_service() |
232 |
- * invoked below. |
233 |
+ * service, then it must be extracted from its active |
234 |
+ * tree, so as to make sure that it won't be |
235 |
+ * considered when computing next_in_service. See the |
236 |
+ * comments on the function |
237 |
+ * bfq_no_longer_next_in_service() for details. |
238 |
*/ |
239 |
if (bfq_no_longer_next_in_service(entity)) |
240 |
bfq_active_extract(bfq_entity_service_tree(entity), |
241 |
entity); |
242 |
|
243 |
/* |
244 |
- * For the same reason why we may have just extracted |
245 |
- * entity from its active tree, we may need to update |
246 |
- * next_in_service for the sched_data of entity too, |
247 |
- * regardless of whether entity has been extracted. |
248 |
- * In fact, even if entity has not been extracted, a |
249 |
- * descendant entity may get extracted. Such an event |
250 |
- * would cause a change in next_in_service for the |
251 |
- * level of the descendant entity, and thus possibly |
252 |
- * back to upper levels. |
253 |
+ * Even if entity is not to be extracted according to |
254 |
+ * the above check, a descendant entity may get |
255 |
+ * extracted in one of the next iterations of this |
256 |
+ * loop. Such an event could cause a change in |
257 |
+ * next_in_service for the level of the descendant |
258 |
+ * entity, and thus possibly back to this level. |
259 |
* |
260 |
- * We cannot perform the resulting needed update |
261 |
- * before the end of this loop, because, to know which |
262 |
- * is the correct next-to-serve candidate entity for |
263 |
- * each level, we need first to find the leaf entity |
264 |
- * to set in service. In fact, only after we know |
265 |
- * which is the next-to-serve leaf entity, we can |
266 |
- * discover whether the parent entity of the leaf |
267 |
- * entity becomes the next-to-serve, and so on. |
268 |
+ * However, we cannot perform the resulting needed |
269 |
+ * update of next_in_service for this level before the |
270 |
+ * end of the whole loop, because, to know which is |
271 |
+ * the correct next-to-serve candidate entity for each |
272 |
+ * level, we need first to find the leaf entity to set |
273 |
+ * in service. In fact, only after we know which is |
274 |
+ * the next-to-serve leaf entity, we can discover |
275 |
+ * whether the parent entity of the leaf entity |
276 |
+ * becomes the next-to-serve, and so on. |
277 |
*/ |
278 |
|
279 |
/* Log some information */ |
280 |
diff --git a/block/bfq.h b/block/bfq.h |
281 |
index 6269fd7ac5ae..1fa6ed3c604e 100644 |
282 |
--- a/block/bfq.h |
283 |
+++ b/block/bfq.h |
284 |
@@ -68,17 +68,30 @@ struct bfq_service_tree { |
285 |
* |
286 |
* bfq_sched_data is the basic scheduler queue. It supports three |
287 |
* ioprio_classes, and can be used either as a toplevel queue or as an |
288 |
- * intermediate queue on a hierarchical setup. @next_in_service |
289 |
- * points to the active entity of the sched_data service trees that |
290 |
- * will be scheduled next. It is used to reduce the number of steps |
291 |
- * needed for each hierarchical-schedule update. |
292 |
+ * intermediate queue in a hierarchical setup. |
293 |
* |
294 |
* The supported ioprio_classes are the same as in CFQ, in descending |
295 |
* priority order, IOPRIO_CLASS_RT, IOPRIO_CLASS_BE, IOPRIO_CLASS_IDLE. |
296 |
* Requests from higher priority queues are served before all the |
297 |
* requests from lower priority queues; among requests of the same |
298 |
* queue requests are served according to B-WF2Q+. |
299 |
- * All the fields are protected by the queue lock of the containing bfqd. |
300 |
+ * |
301 |
+ * The schedule is implemented by the service trees, plus the field |
302 |
+ * @next_in_service, which points to the entity on the active trees |
303 |
+ * that will be served next, if 1) no changes in the schedule occurs |
304 |
+ * before the current in-service entity is expired, 2) the in-service |
305 |
+ * queue becomes idle when it expires, and 3) if the entity pointed by |
306 |
+ * in_service_entity is not a queue, then the in-service child entity |
307 |
+ * of the entity pointed by in_service_entity becomes idle on |
308 |
+ * expiration. This peculiar definition allows for the following |
309 |
+ * optimization, not yet exploited: while a given entity is still in |
310 |
+ * service, we already know which is the best candidate for next |
311 |
+ * service among the other active entitities in the same parent |
312 |
+ * entity. We can then quickly compare the timestamps of the |
313 |
+ * in-service entity with those of such best candidate. |
314 |
+ * |
315 |
+ * All the fields are protected by the queue lock of the containing |
316 |
+ * bfqd. |
317 |
*/ |
318 |
struct bfq_sched_data { |
319 |
struct bfq_entity *in_service_entity; /* entity in service */ |
320 |
-- |
321 |
2.14.1 |
322 |
|