Commit 99d287c

update pubs
1 parent 0931eaf commit 99d287c

9 files changed: +143 -15 lines changed

_data/allpubs.yml

Lines changed: 57 additions & 2 deletions
@@ -1,3 +1,52 @@
+- title: 'Verifying Distributed Deep Learning Training via Parallelization Equivalence'
+  conference: SOSP 2025
+  year: 2025
+  authors:
+    - yunchi
+    - chengtan
+    - youshan
+    - ryan
+    - yizhu
+    - xianzhang
+    - fanyang
+  links:
+    conference:
+      url: https://sigops.org/s/conferences/sosp/2025/
+    paper:
+      url: '#'
+- title: 'Optimistic Recovery for High-Availability Software via Partial Process State Preservation'
+  conference: SOSP 2025
+  year: 2025
+  authors:
+    - yuzhuo
+    - yuqimai
+    - angting
+    - yichen
+    - wanning
+    - xiaoyang
+    - peter
+    - ryan
+  links:
+    conference:
+      url: https://sigops.org/s/conferences/sosp/2025/
+    paper:
+      url: '#'
+- title: 'Mitigating Application Resource Overload with Targeted Task Cancellation'
+  conference: SOSP 2025
+  year: 2025
+  authors:
+    - yigong
+    - zeyin
+    - yicheng
+    - yile
+    - shuangyu
+    - baris
+    - ryan
+  links:
+    conference:
+      url: https://sigops.org/s/conferences/sosp/2025/
+    paper:
+      url: '#'
 - title: 'Training with Confidence: Catching Silent Errors in Deep Learning Training with Automated Proactive Checks'
   conference: OSDI 2025
   year: 2025
@@ -12,7 +61,11 @@
     conference:
       url: https://www.usenix.org/conference/osdi25
     paper:
-      url: '#'
+      url: paper/traincheck-osdi25-preprint.pdf
+    bib:
+      url: paper/traincheck-osdi25.bib
+    arxiv:
+      url: https://www.arxiv.org/abs/2506.14813
     software:
       url: https://github.com/OrderLab/TrainCheck
 - title: 'Deriving Semantic Checkers from Tests to Detect Silent Failures in Production Distributed Systems'
@@ -33,7 +86,9 @@
     conference:
       url: https://www.usenix.org/conference/osdi25
     paper:
-      url: '#'
+      url: paper/t2c-osdi25-preprint.pdf
+    bib:
+      url: paper/t2c-osdi25.bib
     software:
       url: https://github.com/OrderLab/T2C
 - title: 'One-Size-Fits-None: Understanding and Enhancing Slow-Fault Tolerance in Modern Distributed Systems'

_data/authors.yml

Lines changed: 42 additions & 1 deletion
@@ -109,7 +109,7 @@ calvin:
   url: http://cs.unc.edu/~cd
 yigong:
   name: Yigong Hu
-  url: https://www.cs.jhu.edu/~yigonghu
+  url: https://yigonghu.github.io
 suyi:
   name: Suyi Liu
   url: https://sylll.github.io
@@ -251,3 +251,44 @@ beijie:
   name: Beijie Liu
 runhui:
   name: Runhui Xu
+chengtan:
+  name: Cheng Tan
+  url: https://naizhengtan.github.io
+youshan:
+  name: Youshan Miao
+  url: https://www.microsoft.com/en-us/research/people/yomia
+yizhu:
+  name: Yi Zhu
+  url: https://www.microsoft.com/en-us/research/people/yizhu1
+xianzhang:
+  name: Xian Zhang
+  url: https://www.microsoft.com/en-us/research/people/zhxian
+fanyang:
+  name: Fan Yang
+  url: https://www.microsoft.com/en-us/research/people/fanyang
+yuqimai:
+  name: Yuqi Mai
+angting:
+  name: Angting Cai
+yichen:
+  name: Yi Chen
+  url: https://chenyi.world
+wanning:
+  name: Wanning He
+  url: https://hwanning.netlify.app
+xiaoyang:
+  name: Xiaoyang Qian
+peter:
+  name: Peter M. Chen
+  url: https://web.eecs.umich.edu/~pmchen
+zeyin:
+  name: Zeyin Zhang
+yicheng:
+  name: Yicheng Liu
+yile:
+  name: Yile Gu
+shuangyu:
+  name: Shuangyu Lei
+baris:
+  name: Baris Kasikci
+  url: https://homes.cs.washington.edu/~baris
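
Reviewer note: the author keys listed in _data/allpubs.yml appear to be resolved against _data/authors.yml to produce the `authorlist` spans seen in pubs.html. The site's actual generator is not part of this commit, so the following is only a minimal Python sketch of that lookup. The `AUTHORS` dict is a hand-copied subset of the entries above, and `render_author` and `render_authorlist` are hypothetical names.

```python
from html import escape

# Hand-copied subset of _data/authors.yml, as if already parsed into a dict.
# An entry without a "url" key renders as plain text (assumption based on
# how authors without links appear in pubs.html).
AUTHORS = {
    "yigong": {"name": "Yigong Hu", "url": "https://yigonghu.github.io"},
    "zeyin": {"name": "Zeyin Zhang"},
    "baris": {"name": "Baris Kasikci",
              "url": "https://homes.cs.washington.edu/~baris"},
}


def render_author(key, authors):
    """Return a linked name when a URL is known, else the bare name."""
    entry = authors[key]
    name = escape(entry["name"])
    url = entry.get("url")
    if url:
        return f'<a href="{escape(url, quote=True)}" class="nodec">{name}</a>'
    return name


def render_authorlist(keys, authors):
    """Wrap each author in <i>; separate with ', ' and end with <br>."""
    parts = []
    for i, key in enumerate(keys):
        sep = ", " if i < len(keys) - 1 else "<br>"
        parts.append(f"<i>{render_author(key, authors)}{sep}</i>")
    return '<span class="authorlist">' + "".join(parts) + "</span>"


# Illustrative subset of keys, not a real paper's full author list.
html_out = render_authorlist(["yigong", "zeyin", "baris"], AUTHORS)
print(html_out)
```

A key that is missing from the dict raises `KeyError` here, which would surface a forgotten authors.yml entry early; whether the real generator behaves that way is not shown in this diff.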

_includes/head.html

Lines changed: 1 addition & 0 deletions
@@ -17,4 +17,5 @@
 <link rel="stylesheet" href="{{ '/vendors/flaticon/flaticon.css' | relative_url }}" />
 <link rel="stylesheet" href="{{ '/vendors/owl-carousel/owl.carousel.min.css' | relative_url }}">
 <link rel="stylesheet" href="{{ '/assets/css/style.css' | relative_url }}">
+<link rel="stylesheet" href="{{ '/assets/css/site.css' | relative_url }}">
 </head>

_layouts/publication.html

Lines changed: 4 additions & 0 deletions
@@ -1,3 +1,5 @@
+<section class="section-margin">
+<div class="container">
 {%- assign allpubs = site.data.allpubs -%}
 {%- assign prevyear = null -%}
 {%- for pub in allpubs -%}
@@ -12,3 +14,5 @@ <h2 id="publications">{{ pub.year }}</h2>
 {%- include pubitem.html pub=pub -%}
 {%- endfor %}
 </ul>
+</div>
+</section>
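
Reviewer note: the lines of _layouts/publication.html between the two hunks are not shown, but the visible `prevyear` variable and the `<h2 id="publications">{{ pub.year }}</h2>` hunk context suggest the loop emits a new year heading whenever the year changes. Below is a rough Python sketch of that pattern under that assumption. `group_by_year` is a hypothetical name, and the `<li>` body stands in for whatever pubitem.html renders.

```python
def group_by_year(pubs):
    """Emit a year heading and a fresh <ul> each time the year changes.

    Assumes `pubs` is already sorted so that entries of the same year
    are adjacent, as they are in _data/allpubs.yml.
    """
    lines = []
    prev_year = None
    for pub in pubs:
        if pub["year"] != prev_year:
            if prev_year is not None:
                lines.append("</ul>")  # close the previous year's list
            lines.append(f'<h2 id="publications">{pub["year"]}</h2>')
            lines.append('<ul class="publications">')
            prev_year = pub["year"]
        # Placeholder for the per-publication include (pubitem.html).
        lines.append(f"<li>{pub['title']}</li>")
    if prev_year is not None:
        lines.append("</ul>")  # close the final list
    return lines


# Hypothetical titles, used only to exercise the grouping logic.
sample = [
    {"year": 2025, "title": "A"},
    {"year": 2025, "title": "B"},
    {"year": 2023, "title": "C"},
]
for line in group_by_year(sample):
    print(line)
```

The new `<section>` and `<div class="container">` wrappers added by this commit would sit outside this loop, so they do not affect the grouping.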

assets/css/site.css

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+.authorlist i {
+  color: #333;
+}

paper/t2c-osdi25-preprint.pdf

840 KB
Binary file not shown.

paper/t2c-osdi25.bib

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+@inproceedings{T2COSDI2025,
+author = {Lou, Chang and Parikesit, Dimas Shidqi and Huang, Yujin and Yang, Zhewen and Diwangkara, Senapati and Jing, Yuzhuo and Kistijantoro, Achmad Imam and Yuan, Ding and Nath, Suman and Huang, Peng},
+title = {Deriving Semantic Checkers from Tests to Detect Silent Failures in Production Distributed Systems},
+booktitle = {Proceedings of the 19th USENIX Symposium on Operating Systems Design and Implementation},
+series = {OSDI '25},
+month = {July},
+year = {2025},
+address = {Boston, MA, USA},
+publisher = {USENIX Association},
+}

paper/traincheck-osdi25-preprint.pdf

0 Bytes
Binary file not shown.

pubs.html

Lines changed: 26 additions & 12 deletions
@@ -3,19 +3,33 @@
 title: Publications
 ---
 <section class="section-margin">
-<div class="container">
+<div class="container">
 <h2 id="publications">2025</h2>
 <ul class="publications">
 <li>
-<a target="_blank" href="paper/traincheck-osdi25-preprint.pdf">Training with Confidence: Catching Silent Errors in Deep Learning Training with Automated Proactive Checks/a><br>
+<a target="_blank" href="#">Verifying Distributed Deep Learning Training via Parallelization Equivalence</a><br>
+<span class="authorlist"><i><a href="https://mercury-browser-ede.notion.site/yunchi" class="nodec">Yunchi Lu</a>, </i><i><a href="https://naizhengtan.github.io" class="nodec">Cheng Tan</a>, </i><i><a href="https://www.microsoft.com/en-us/research/people/yomia" class="nodec">Youshan Miao</a>, </i><i><a href="https://web.eecs.umich.edu/~ryanph" class="nodec">Peng Huang</a>, </i><i><a href="https://www.microsoft.com/en-us/research/people/yizhu1" class="nodec">Yi Zhu</a>, </i><i><a href="https://www.microsoft.com/en-us/research/people/zhxian" class="nodec">Xian Zhang</a>, </i><i><a href="https://www.microsoft.com/en-us/research/people/fanyang" class="nodec">Fan Yang</a><br></i></span>
+<a target="_blank" href="https://sigops.org/s/conferences/sosp/2025/" class="conf"><b>SOSP 2025</b></a>
+</li>
+<li>
+<a target="_blank" href="#">Optimistic Recovery for High-Availability Software via Partial Process State Preservation</a><br>
+<span class="authorlist"><i><a href="https://osdi.dev" class="nodec">Yuzhuo Jing</a>, </i><i>Yuqi Mai, </i><i>Angting Cai, </i><i><a href="https://chenyi.world" class="nodec">Yi Chen</a>, </i><i><a href="https://hwanning.netlify.app" class="nodec">Wanning He</a>, </i><i>Xiaoyang Qian, </i><i><a href="https://web.eecs.umich.edu/~pmchen" class="nodec">Peter M. Chen</a>, </i><i><a href="https://web.eecs.umich.edu/~ryanph" class="nodec">Peng Huang</a><br></i></span>
+<a target="_blank" href="https://sigops.org/s/conferences/sosp/2025/" class="conf"><b>SOSP 2025</b></a>
+</li>
+<li>
+<a target="_blank" href="#">Mitigating Application Resource Overload with Targeted Task Cancellation</a><br>
+<span class="authorlist"><i><a href="https://yigonghu.github.io" class="nodec">Yigong Hu</a>, </i><i>Zeyin Zhang, </i><i>Yicheng Liu, </i><i>Yile Gu, </i><i>Shuangyu Lei, </i><i><a href="https://homes.cs.washington.edu/~baris" class="nodec">Baris Kasikci</a>, </i><i><a href="https://web.eecs.umich.edu/~ryanph" class="nodec">Peng Huang</a><br></i></span>
+<a target="_blank" href="https://sigops.org/s/conferences/sosp/2025/" class="conf"><b>SOSP 2025</b></a>
+</li>
+<li>
+<a target="_blank" href="paper/traincheck-osdi25-preprint.pdf">Training with Confidence: Catching Silent Errors in Deep Learning Training with Automated Proactive Checks</a><br>
 <span class="authorlist"><i><a href="https://essoz.github.io" class="nodec">Yuxuan Jiang</a>, </i><i>Ziming Zhou, </i><i>Boyu Xu, </i><i>Beijie Liu, </i><i>Runhui Xu, </i><i><a href="https://web.eecs.umich.edu/~ryanph" class="nodec">Peng Huang</a><br></i></span>
-<a target="_blank" href="https://www.usenix.org/conference/osdi25" class="conf"><b>OSDI 2025</b></a>&nbsp;&nbsp;<a target="_blank" class="btn btn-outline-primary publinkitem" href="paper/traincheck-osdi25.bib">BibTeX</a>
-&nbsp;&nbsp;<a target="_blank" class="btn btn-outline-primary publinkitem" href="https://github.com/OrderLab/TrainCheck">Software</a>
+<a target="_blank" href="https://www.usenix.org/conference/osdi25" class="conf"><b>OSDI 2025</b></a>&nbsp;&nbsp;<a target="_blank" class="btn btn-outline-primary publinkitem" href="paper/traincheck-osdi25.bib">BibTeX</a>&nbsp;&nbsp;<a target="_blank" class="btn btn-outline-primary publinkitem" href="https://github.com/OrderLab/TrainCheck">Software</a>&nbsp;&nbsp;<a target="_blank" role="button" class="btn btn-outline-primary publinkitem" href="https://www.arxiv.org/abs/2506.14813">[ArXiv]</a>
 </li>
 <li>
-<a target="_blank" href="#">Deriving Semantic Checkers from Tests to Detect Silent Failures in Production Distributed Systems</a><br>
+<a target="_blank" href="paper/t2c-osdi25-preprint.pdf">Deriving Semantic Checkers from Tests to Detect Silent Failures in Production Distributed Systems</a><br>
 <span class="authorlist"><i><a href="https://www.cs.jhu.edu/~chlou/about" class="nodec">Chang Lou</a>, </i><i>Dimas Shidqi Parikesit, </i><i>Yujin Huang, </i><i>Zhewen Yang, </i><i>Senapati Diwangkara, </i><i><a href="https://osdi.dev" class="nodec">Yuzhuo Jing</a>, </i><i>Achmad Imam Kistijantoro, </i><i><a href="http://www.eecg.toronto.edu/~yuan" class="nodec">Ding Yuan</a>, </i><i><a href="https://www.microsoft.com/en-us/research/people/sumann" class="nodec">Suman Nath</a>, </i><i><a href="https://web.eecs.umich.edu/~ryanph" class="nodec">Peng Huang</a><br></i></span>
-<a target="_blank" href="https://www.usenix.org/conference/osdi25" class="conf"><b>OSDI 2025</b></a>&nbsp;&nbsp;<a target="_blank" class="btn btn-outline-primary publinkitem" href="https://github.com/OrderLab/T2C">Software</a>
+<a target="_blank" href="https://www.usenix.org/conference/osdi25" class="conf"><b>OSDI 2025</b></a>&nbsp;&nbsp;<a target="_blank" class="btn btn-outline-primary publinkitem" href="paper/t2c-osdi25.bib">BibTeX</a>&nbsp;&nbsp;<a target="_blank" class="btn btn-outline-primary publinkitem" href="https://github.com/OrderLab/T2C">Software</a>
 </li>
 <li>
 <a target="_blank" href="paper/xinda-nsdi25-preprint.pdf">One-Size-Fits-None: Understanding and Enhancing Slow-Fault Tolerance in Modern Distributed Systems</a><br>
@@ -50,13 +64,13 @@ <h2 id="publications">2023</h2>
 </li>
 <li>
 <a target="_blank" href="paper/pbox-sosp23.pdf">Pushing Performance Isolation Boundaries into Application with pBox</a><br>
-<span class="authorlist"><i><a href="https://www.cs.jhu.edu/~yigonghu" class="nodec">Yigong Hu</a>, </i><i><a href="https://gongqihuang.com" class="nodec">Gongqi Huang</a>, </i><i><a href="https://web.eecs.umich.edu/~ryanph" class="nodec">Peng Huang</a><br></i></span>
+<span class="authorlist"><i><a href="https://yigonghu.github.io" class="nodec">Yigong Hu</a>, </i><i><a href="https://gongqihuang.com" class="nodec">Gongqi Huang</a>, </i><i><a href="https://web.eecs.umich.edu/~ryanph" class="nodec">Peng Huang</a><br></i></span>
 <a target="_blank" href="https://sosp2023.mpi-sws.org" class="conf"><b>SOSP 2023</b></a>&nbsp;&nbsp;<a target="_blank" class="btn btn-outline-primary publinkitem" href="paper/pbox-sosp23.bib">BibTeX</a>
 &nbsp;&nbsp;<a target="_blank" role="button" class="btn btn-outline-primary publinkitem" href="slides/pbox_sosp23_slides.pdf">Slides</a>&nbsp;&nbsp;<a target="_blank" class="btn btn-outline-primary publinkitem" href="https://github.com/OrderLab/pBox">Software</a>
 </li>
 <li>
 <a target="_blank" href="paper/vprof-eurosys23.pdf">Effective Performance Issue Diagnosis with Value-Assisted Cost Profiling</a><br>
-<span class="authorlist"><i>Lingmei Weng, </i><i><a href="https://www.cs.jhu.edu/~yigonghu" class="nodec">Yigong Hu</a>, </i><i><a href="https://web.eecs.umich.edu/~ryanph" class="nodec">Peng Huang</a>, </i><i><a href="http://www.cs.columbia.edu/~nieh" class="nodec">Jason Nieh</a>, </i><i><a href="http://www.cs.columbia.edu/~junfeng" class="nodec">Junfeng Yang</a><br></i></span>
+<span class="authorlist"><i>Lingmei Weng, </i><i><a href="https://yigonghu.github.io" class="nodec">Yigong Hu</a>, </i><i><a href="https://web.eecs.umich.edu/~ryanph" class="nodec">Peng Huang</a>, </i><i><a href="http://www.cs.columbia.edu/~nieh" class="nodec">Jason Nieh</a>, </i><i><a href="http://www.cs.columbia.edu/~junfeng" class="nodec">Junfeng Yang</a><br></i></span>
 <a target="_blank" href="https://2023.eurosys.org" class="conf"><b>EuroSys 2023</b></a>&nbsp;&nbsp;<a target="_blank" class="btn btn-outline-primary publinkitem" href="paper/vprof-eurosys23.bib">BibTeX</a>
 &nbsp;&nbsp;<a target="_blank" role="button" class="btn btn-outline-primary publinkitem" href="slides/vprof_eurosys23_slides.pdf">Slides</a>&nbsp;&nbsp;<a target="_blank" class="btn btn-outline-primary publinkitem" href="https://github.com/wenglingmei/vprofAE">Software</a>
 </li>
@@ -104,7 +118,7 @@ <h2 id="publications">2020</h2>
 <ul class="publications">
 <li>
 <a target="_blank" href="paper/violet-osdi20.pdf">Automated Reasoning and Detection of Specious Configuration in Large Systems with Symbolic Execution</a><br>
-<span class="authorlist"><i><a href="https://www.cs.jhu.edu/~yigonghu" class="nodec">Yigong Hu</a>, </i><i><a href="https://gongqihuang.com" class="nodec">Gongqi Huang</a>, </i><i><a href="https://web.eecs.umich.edu/~ryanph" class="nodec">Peng Huang</a><br></i></span>
+<span class="authorlist"><i><a href="https://yigonghu.github.io" class="nodec">Yigong Hu</a>, </i><i><a href="https://gongqihuang.com" class="nodec">Gongqi Huang</a>, </i><i><a href="https://web.eecs.umich.edu/~ryanph" class="nodec">Peng Huang</a><br></i></span>
 <a target="_blank" href="https://www.usenix.org/conference/osdi20" class="conf"><b>OSDI 2020</b></a>&nbsp;&nbsp;<a target="_blank" class="btn btn-outline-primary publinkitem" href="paper/violet-osdi20.bib">BibTeX</a>
 &nbsp;&nbsp;<a target="_blank" role="button" class="btn btn-outline-primary publinkitem" href="slides/violet_osdi20_slides.pdf">Slides</a>&nbsp;&nbsp;<a target="_blank" class="btn btn-outline-primary publinkitem" href="https://github.com/OrderLab/violet">Software</a>&nbsp;&nbsp;<a target="_blank" role="button" class="btn btn-outline-primary publinkitem" href="paper/violet-tech-report.pdf">TechReport</a>
 </li>
@@ -126,7 +140,7 @@ <h2 id="publications">2020</h2>
 </li>
 <li>
 <a target="_blank" href="paper/sdig-aaai20-workshop.pdf">Scaling Performance Issue Detection and Diagnosis in Cloud Infrastructures</a><br>
-<span class="authorlist"><i><a href="https://www.cs.jhu.edu/~yigonghu" class="nodec">Yigong Hu</a>, </i><i>Ze Li, </i><i><a href="https://web.eecs.umich.edu/~ryanph" class="nodec">Peng Huang</a>, </i><i>Suhas Pinnamaneni, </i><i>Francis David, </i><i>Yingnong Dang, </i><i>Murali Chintalapati<br></i></span>
+<span class="authorlist"><i><a href="https://yigonghu.github.io" class="nodec">Yigong Hu</a>, </i><i>Ze Li, </i><i><a href="https://web.eecs.umich.edu/~ryanph" class="nodec">Peng Huang</a>, </i><i>Suhas Pinnamaneni, </i><i>Francis David, </i><i>Yingnong Dang, </i><i>Murali Chintalapati<br></i></span>
 <a target="_blank" href="https://cloudintelligenceworkshop.org" class="conf"><b>AAAI-20 Workshop on Cloud Intelligence</b></a>&nbsp;&nbsp;<a target="_blank" class="btn btn-outline-primary publinkitem" href="paper/sdig-aaai20.bib">BibTeX</a>
 </li>
 
@@ -146,7 +160,7 @@ <h2 id="publications">2019</h2>
 </li>
 <li>
 <a target="_blank" href="paper/leaseos-asplos19.pdf">A Case for Lease-Based, Utilitarian Resource Management on Mobile Devices</a>&nbsp;&nbsp;&nbsp;<b style="color:green">[Best Paper Award]</b><br>
-<span class="authorlist"><i><a href="https://www.cs.jhu.edu/~yigonghu" class="nodec">Yigong Hu</a>, </i><i><a href="https://sylll.github.io" class="nodec">Suyi Liu</a>, </i><i><a href="https://web.eecs.umich.edu/~ryanph" class="nodec">Peng Huang</a><br></i></span>
+<span class="authorlist"><i><a href="https://yigonghu.github.io" class="nodec">Yigong Hu</a>, </i><i><a href="https://sylll.github.io" class="nodec">Suyi Liu</a>, </i><i><a href="https://web.eecs.umich.edu/~ryanph" class="nodec">Peng Huang</a><br></i></span>
 <a target="_blank" href="https://asplos-conference.org" class="conf"><b>ASPLOS 2019</b></a>&nbsp;&nbsp;<a target="_blank" class="btn btn-outline-primary publinkitem" href="paper/leaseos.bib">BibTeX</a>
 &nbsp;&nbsp;<a target="_blank" role="button" class="btn btn-outline-primary publinkitem" href="slides/leaseos_asplos19_slides.pptx">Slides</a>&nbsp;&nbsp;<a target="_blank" class="btn btn-outline-primary publinkitem" href="https://orderlab.io/LeaseOS">Software</a><br><div class="press"><b>Coverage:</b> <a target="_blank" href="https://blog.acolyer.org/2019/05/31/lease-os">The Morning Paper</a> </div>
 </li>
@@ -280,5 +294,5 @@ <h2 id="publications">2010</h2>
 </li>
 
 </ul>
-</div>
+</div>
 </section>
