Stochastic Knapsack
Viswanath Nagarajan, University of Michigan
MDS Autumn School, Sep 29 – Oct 1, 2014

Adaptive Stochastic Optimization

The model:
- Known distribution D
- Unknown realized input X ~ D
A solution (policy) is an adaptive sequence of actions:
- Each action reveals extra information about X
Goal: the policy with the best expected objective. Here: a maximization objective.
1. Stochastic knapsack
2. Stochastic matching

Stochastic Knapsack

Recall the deterministic knapsack problem:
- Jobs with a reward and a size; budget B
- Maximize reward s.t. total size ≤ B
Stochastic knapsack [Dean Goemans Vondrak '04]:
- Job sizes are random: known, arbitrary distributions; independent; the exact size is revealed only after selection
- Deterministic rewards
- Maximize expected reward s.t. total size ≤ B

[Figure: a deterministic example instance (rewards r1 = 2, r2 = 4, r3 = 5; sizes 5, 7, 3; budget B = 9; OPT = 6) next to a stochastic job with r = 1 whose size is S = 0 w.p. 1/2 and S = 5 w.p. 1/2]

Example: Stochastic Knapsack

Feasible solution: select jobs sequentially. Budget B = 5.
- r1 = 1;  S1 = 0 w.p. 1/2,  S1 = 5 w.p. 1/2
- r2 = 1;  S2 = 5 w.p. 1
- r3 = 10; S3 = 0 w.p. 0.1, S3 = 6 w.p. 0.9

Adaptive policy tree: run job 1 first.
- If S1 = 0: run job 2 (it fits exactly), then job 3; the leaf rewards are 12 (S3 = 0) and 2 (S3 = 6).
- If S1 = 5: no budget remains for job 2, so skip it and try job 3; the leaf rewards are 11 (S3 = 0) and 1 (S3 = 6).

E[Reward] = 0.5·(0.1·12 + 0.9·2) + 0.5·(0.1·11 + 0.9·1) = 2.5
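As a sanity check, here is a minimal Python sketch (mine, not from the slides) that computes the optimal adaptive value of this instance by exhaustive recursion; it assumes the standard Dean-Goemans-Vondrak convention that an overflowing job earns no reward and ends the process:

# Each job is (reward, [(size, probability), ...]); budget B = 5.
JOBS = [
    (1,  [(0, 0.5), (5, 0.5)]),   # job 1
    (1,  [(5, 1.0)]),             # job 2
    (10, [(0, 0.1), (6, 0.9)]),   # job 3
]

def opt_adaptive(remaining, budget):
    """Optimal expected reward over all adaptive policies: try each
    possible next job and branch on its random size outcomes."""
    best = 0.0
    for i in remaining:
        reward, dist = JOBS[i]
        value = 0.0
        for size, prob in dist:
            if size <= budget:  # job completes: collect reward, recurse
                value += prob * (reward + opt_adaptive(remaining - {i}, budget - size))
            # else: the job overflows and the policy stops (no reward)
        best = max(best, value)
    return best

print(opt_adaptive(frozenset({0, 1, 2}), 5))  # -> 2.5, matching the tree above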

Representing Solutions

Decision tree or dynamic program:
- nodes represent the residual budget and the job chosen
- branches represent the random size outcomes
- a.k.a. an adaptive policy
Describing such a policy may take exponential space.

[Figure: an example policy tree and an example policy dynamic program]

Simpler Class of Policies

Non-adaptive policy:
- Add jobs in an a priori fixed order, until the budget is exhausted
- Polynomial-space representation: a permutation of the jobs

The adaptivity gap captures the loss in objective:

  adaptivity gap = max over instances I of  OPT(AD(I)) / OPT(NA(I))

Adaptivity Gap Example

Same instance as before (budget B = 5):
- r1 = 1;  S1 = 0 w.p. 1/2,  S1 = 5 w.p. 1/2
- r2 = 1;  S2 = 5 w.p. 1
- r3 = 10; S3 = 0 w.p. 0.1, S3 = 6 w.p. 0.9

The adaptive policy above achieves E[Adaptive] = 2.5. A best non-adaptive order is 1, 3, 2, which achieves

  E[NonAdaptive] = 0.5·(0.1·12 + 0.9·1) + 0.5·(0.1·11 + 0.9·1) = 2.05

So the adaptivity gap on this instance is 2.5 / 2.05 ≈ 1.22.
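The non-adaptive side can be checked by brute force over all 3! orders; the sketch below (same conventions and hypothetical job encoding as in the earlier snippet) recovers 2.05 for the order 1, 3, 2:

from itertools import permutations

JOBS = [
    (1,  [(0, 0.5), (5, 0.5)]),   # job 1
    (1,  [(5, 1.0)]),             # job 2
    (10, [(0, 0.1), (6, 0.9)]),   # job 3
]

def nonadaptive_value(order, budget):
    """Expected reward of a fixed job order; the first overflow stops the run."""
    def go(k, budget):
        if k == len(order):
            return 0.0
        reward, dist = JOBS[order[k]]
        return sum(prob * (reward + go(k + 1, budget - size))
                   for size, prob in dist if size <= budget)
    return go(0, budget)

best = max(permutations(range(3)), key=lambda o: nonadaptive_value(o, 5))
print(best, nonadaptive_value(best, 5))  # -> (0, 2, 1) 2.05, i.e. order 1, 3, 2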

Why Non-Adaptive?

The problem consists of an "offline" and an "online" phase:
- Offline = before any job is run
- Online = while jobs are running
Non-adaptive policies do all the work in the offline phase, and the online phase is easy/fast to implement.

[Figure: timelines of the non-adaptive and adaptive algorithms, split into offline and online phases]

Approximation Ratio

For a stochastic knapsack instance I:

  OPT(I) = max { E[objective under π] : π is a policy }

Contrast this with the online "competitive ratio", which is measured relative to

  EOPT(I) = E_{X←D} [optimal value on input X]

E.g., consider n identical jobs, each with r = 1 and size S = 0 w.p. 1/2, S = B+1 w.p. 1/2. Then EOPT = n/2, but OPT ≤ 2: the k-th job is attempted only if all earlier jobs realized size 0, which happens with probability 2^{-(k-1)}.

  Approximation ratio = max over instances I of  OPT(I) / ALG(I)

Here: information is revealed gradually to both ALG and OPT.
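A quick exact computation of both quantities for this instance, again under the convention that an overflowing job earns nothing and ends the run:

def eopt(n):
    # The offline optimum keeps exactly the size-0 jobs: E[count] = n/2.
    return n / 2

def best_policy_value(n):
    # Job k pays off only if jobs 1..k all realize size 0 (prob 2^-k),
    # so every policy earns sum_{k=1}^{n} 2^{-k} < 1 (certainly <= 2).
    return sum(0.5 ** k for k in range(1, n + 1))

for n in (10, 100, 1000):
    print(n, eopt(n), round(best_policy_value(n), 6))
# The ratio EOPT/OPT grows linearly in n: no policy can compete with EOPT.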

Outline
1. Stochastic knapsack (SK) basics
2. Non-adaptive algorithm for SK
3. Correlated stochastic knapsack (CSK): non-adaptive algorithm
4. Adaptive algorithm for CSK
5. Extensions

Some Definitions
- Jobs [n] := {1, 2, …, n}
- ri = reward of job i
- Si = size of job i (a random variable)
- Budget B = 1 (by scaling)
- Capped mean size: ei = E[min{Si, 1}]
- Effective reward: wi = ri · Pr[Si ≤ 1]
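For concreteness, a small sketch (my own helpers, not from the slides) computing these two quantities for discrete size distributions, applied to the running example after rescaling its budget B = 5 down to 1:

def capped_mean(dist):
    """e_i = E[min{S_i, 1}] for a discrete size distribution [(size, prob)]."""
    return sum(prob * min(size, 1.0) for size, prob in dist)

def effective_reward(r, dist):
    """w_i = r_i * Pr[S_i <= 1]."""
    return r * sum(prob for size, prob in dist if size <= 1.0)

# Running example with all sizes divided by B = 5:
d1 = [(0.0, 0.5), (1.0, 0.5)]
d2 = [(1.0, 1.0)]
d3 = [(0.0, 0.1), (1.2, 0.9)]
print([capped_mean(d) for d in (d1, d2, d3)])                             # [0.5, 1.0, 0.9]
print([effective_reward(r, d) for r, d in [(1, d1), (1, d2), (10, d3)]])  # [1.0, 1.0, 1.0]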

LP Relaxation

  max  Σ_{i=1..n} wi·xi
  s.t. Σ_{i=1..n} ei·xi ≤ 2
       0 ≤ xi ≤ 1,   i = 1, 2, …, n

Theorem: LP* ≥ OPT.

Proof idea: let Ti := indicator that job i is chosen by OPT (it may not fit), and consider the LP solution xi = Pr[Ti = 1].
Claim: Σ_{i=1..n} min{Si, 1}·Ti ≤ 2, since along each decision path at most one job overflows, and the policy ends there. Taking expectations (Si and Ti are independent, because Ti depends only on the sizes of previously chosen jobs) gives Σ ei·xi ≤ 2, so x ∈ LP. Moreover, OPT collects ri only when Ti = 1 and Si ≤ 1, so OPT ≤ Σ ri·Pr[Si ≤ 1]·Pr[Ti = 1] = Σ wi·xi ≤ LP*.

LP relaxation: formal proof
- At := set of the first t jobs chosen by OPT (t = 0, 1, …, n); A0 = ∅ and An = all jobs chosen by OPT. Note Ti = 1 iff i ∈ An.
- Yt := Σ_{i∈At} (min{Si, 1} − ei)
- Conditioned on Yt and the next job j:  E[Yt+1 | Yt, j] = Yt + E[min{Sj, 1}] − ej = Yt
- So Y0, …, Yn is a martingale, and E[Yn] = E[Y0] = 0.
- i.e. E[Σ_{i∈An} ei] = E[Σ_{i∈An} min{Si, 1}] ≤ 2.
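To make the relaxation concrete, here is a sketch that solves this LP for the running example (the three jobs from the adaptivity-gap slide, rescaled to B = 1, with the w and e values computed above), using scipy.optimize.linprog:

import numpy as np
from scipy.optimize import linprog

w = np.array([1.0, 1.0, 1.0])   # effective rewards w_i
e = np.array([0.5, 1.0, 0.9])   # capped mean sizes e_i

# max w.x  s.t.  e.x <= 2,  0 <= x <= 1   (linprog minimizes, so negate w)
res = linprog(c=-w, A_ub=e.reshape(1, -1), b_ub=[2.0],
              bounds=[(0.0, 1.0)] * 3, method="highs")
print(res.x, -res.fun)  # x = [1, 0.6, 1], LP* = 2.6 >= OPT = 2.5

As the theorem guarantees, LP* = 2.6 is an upper bound on the optimal adaptive value 2.5 computed earlier.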
