|
44 | 44 | <table style="width:100%;border:0px;border-spacing:0px;border-collapse:separate;margin-right:auto;margin-left:auto;">
|
45 | 45 | <tr style="padding:0px">
|
46 | 46 | <td style="padding:1.5%;width:16%;vertical-align:middle;text-align:center">
|
47 |
| - <a href="index.html" target="_self">Home</a> |
| 47 | + <a href="index.html" target="_self"><big>Home</big></a> |
48 | 48 | </td>
|
49 | 49 | </tr>
|
50 | 50 | </table>
|
|
95 | 95 | <br>
|
96 | 96 | <br>
|
97 | 97 | <tr>
|
98 |
| - Now that we have introduced the notion of discrete optimisation problems, we will discuss a method used for searching solutions : <b class="term">constraint programming</b> (CP). CP is a paradigm to solve discrete optimization problems. It lies on methods for <b class="term">problem declaration</b> and <b class="term">problem solving</b> using a solver that automates the constraint propagation mechanism. For more information about it, please look at <a href="https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&cad=rja&uact=8&ved=2ahUKEwjb1KrqiZX6AhU-FVkFHaUACBkQFnoECCoQAQ&url=https%3A%2F%2Fwww.cs.upc.edu%2F~erodri%2Fwebpage%2Fcps%2Ftheory%2Fcp%2Fintro%2Fslides.pdf&usg=AOvVaw3uGZ4zJqrdVZ5Z9THZWZNP" target="_blank">this presentation from Enric Rodrıguez-Carbonell </a>. |
| 98 | + Now that we have introduced the notion of discrete optimisation problems, we will discuss a method used to search for solutions: <b class="term">constraint programming</b> (CP). CP is a paradigm for solving discrete optimisation problems. It relies on methods for <b class="term">problem declaration</b> and <b class="term">problem solving</b>, using a solver that automates the constraint propagation mechanism (the process of inferring which values remain possible given the values of the variables already assigned; this is, for example, the reasoning people use to solve a sudoku). For more information, please look at <a href="https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&cad=rja&uact=8&ved=2ahUKEwjb1KrqiZX6AhU-FVkFHaUACBkQFnoECCoQAQ&url=https%3A%2F%2Fwww.cs.upc.edu%2F~erodri%2Fwebpage%2Fcps%2Ftheory%2Fcp%2Fintro%2Fslides.pdf&usg=AOvVaw3uGZ4zJqrdVZ5Z9THZWZNP" target="_blank">this presentation from Enric Rodríguez-Carbonell</a>. |
99 | 99 | <br>
|
100 | 100 | The strength of CP comes from the diversity of the problems it is able to solve. Among other things, it can handle non-linear constraints and variables of different types (boolean, integer ...).
|
101 | 101 | <br>
|
|
123 | 123 | <li><b class="term">Variable Selection Heuristic</b> : Select the next variable to assign</li>
|
124 | 124 | <li><b class="term">Value Selection Heuristic</b> : Select the value to assign to the previously selected variable</li>
|
125 | 125 | </ul>
|
126 |
| - The design of these heuristics can have a tremendous impact on search performance. |
| 126 | + The design of these heuristics can have a tremendous impact on search performance. This is what makes the difference between an advanced sudoku player, who finds a solution on the first try, and a beginner, who makes several mistakes and corrections before finding one. Moreover, for some problems, no heuristic is known to work well in the general case. |
127 | 127 | <br>
|
128 |
| - <br> |
129 |
| - My website is still under construction. Be patient I will add the end of the explanations soon ! |
130 |
| - </tbody></table> |
131 |
| - </td> |
132 |
| - </tr> |
| 128 | + This is where we bring in |
| 129 | + <span class="block-line"><span><span style="color:#dd1313;">R</span><span style="color:#dd5013;">e</span><span style="color:#dd8913;">i</span><span style="color:#ddc213;">n</span><span style="color:#bedd13;">f</span><span style="color:#85dd13;">o</span><span style="color:#4cdd13;">r</span><span style="color:#13dd13;">c</span><span style="color:#13dd50;">e</span><span style="color:#13dd89;">m</span><span style="color:#13ddc2;">e</span><span style="color:#13bedd;">n</span><span style="color:#1385dd;">t </span></span><span><span style="color:#134cdd;">L</span><span style="color:#1313dd;">e</span><span style="color:#5013dd;">a</span><span style="color:#8913dd;">r</span><span style="color:#c213dd;">n</span><span style="color:#dd13be;">i</span><span style="color:#dd1385;">n</span><span style="color:#dd134c;">g</span></span></span> ! |
| 130 | + <br> |
| 131 | + <br> |
| 132 | + </tbody></table> |
| 133 | + <table width="100%" border="0" cellspacing="15" cellpadding="10"> |
| 134 | + <heading><b class="term">How to learn a decision-making process?</b></heading> |
| 135 | + <tr> |
| 136 | + <br> |
| 137 | + <br> |
| 138 | + The idea is to train a model with RL to learn a smart <b class="term">Value Selection Heuristic</b> by successively solving thousands of similar problems drawn from a specific distribution, and then to evaluate the performance of this heuristic on problems never seen during training. The resolution process still relies on CP, which guarantees the validity of the returned solutions, while the agent is in charge of branching during the search. We propose a <b class="term">generic approach</b> in which a single agent can learn problems of different natures. For this purpose, we present a generic graph representation of any problem, built from the components of a CP problem definition. |
| 139 | + <br> |
| 140 | + <br> |
| 141 | + The RL algorithm used is <a href="https://www.researchgate.net/figure/The-Q-learning-algorithm-taken-from-Sutton-Barto-1998_fig1_337089781" target="_blank">Deep Q-Learning </a> (DQN). The associated <b class="term">reward</b> is based on the objective function/quality of the solution returned. |
| 142 | + |
| 143 | + <h3>Generic Graph representation</h3> |
| 144 | + Every CP problem is defined by a set of variables, the values these variables can take, and a set of constraints on these variables. The idea is to encode each of these entities as a node in a graph and to connect these nodes according to whether: |
| 145 | + <ul> |
| 146 | + <li>A <b class="term">Value</b> is part of a <b class="term">Variable</b>'s domain of definition</li> |
| 147 | + <li>A <b class="term">Variable</b> is involved in a <b class="term">Constraint</b></li> |
| 148 | + </ul> |
| 149 | + Here is a small example: |
| 150 | + </tr> |
| 151 | + <br> |
| 152 | + <br> |
| 153 | + <tr> |
| 154 | + <div style="text-align: center;"> |
| 155 | + <img src="images/tripartite.png" alt="nthu" width="600" height="190"> |
| 156 | + </div> |
| 157 | + </tr> |
| 158 | + <br> |
| 159 | + <tr> |
| 160 | + The advantage of this method is that it allows all of the information to be encoded in the graph. Each node comes with a feature vector that identifies, among other things, the type of constraint for a Constraint node or the value for a Value node. |
| 161 | + <br> |
| 162 | + <br> |
| 163 | + My website is still under construction. I will add the end of the explanations soon :) ! |
| 164 | + </tr> |
| 165 | + </tr> |
133 | 166 | </table>
|
134 | 167 | <table style="width:100%;border:0px;border-spacing:0px;border-collapse:separate;margin-right:auto;margin-left:auto;"><tbody>
|
135 | 168 | <tr>
|
|
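As a reading aid for the "Generic Graph representation" paragraph added in this diff, here is a minimal Python sketch of how such a graph could be built. It is only an illustration under stated assumptions: the function name `build_tripartite_graph`, the input format, and the feature dictionaries (standing in for the feature vectors mentioned in the text) are hypothetical and are not taken from the project behind the page.

```python
from itertools import count


def build_tripartite_graph(domains, constraints):
    """Encode a CP problem as a graph with Variable, Value and Constraint nodes.

    domains:     dict mapping a variable name to an iterable of allowed values
    constraints: list of (constraint_type, [variable names]) pairs
    Returns (nodes, edges): nodes maps a node id to a feature dict,
    edges is a set of (variable_node_id, other_node_id) pairs.
    All names and the input format are hypothetical.
    """
    ids = count()
    nodes, edges = {}, set()
    var_id, val_id = {}, {}

    # One node per variable.
    for var in domains:
        var_id[var] = next(ids)
        nodes[var_id[var]] = {"kind": "variable", "name": var}

    # One node per distinct value, linked to every variable whose domain holds it.
    for var, domain in domains.items():
        for value in domain:
            if value not in val_id:
                val_id[value] = next(ids)
                nodes[val_id[value]] = {"kind": "value", "value": value}
            edges.add((var_id[var], val_id[value]))

    # One node per constraint, linked to every variable it involves.
    for ctype, scope in constraints:
        cid = next(ids)
        nodes[cid] = {"kind": "constraint", "type": ctype}
        for var in scope:
            edges.add((var_id[var], cid))

    return nodes, edges


# Toy problem: x in {1, 2}, y in {2, 3}, with the constraint x != y.
nodes, edges = build_tripartite_graph(
    {"x": [1, 2], "y": [2, 3]},
    [("not_equal", ["x", "y"])],
)
```

In this toy problem the value 2 becomes a single node shared by `x` and `y`. Whether values are shared across variables or duplicated per variable is a design choice the page does not specify; sharing is assumed here.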