Rethinking of the debian/watch

: subtitle

With thought experiments about uscan

: author

Kentaro Hayashi

: content-source

DebConf18 in Taiwan

: date

2018-08-03

: institution

ClearCode Inc.

: allotted-time

20m

: theme

.

Digest of this talk

Agenda

Who am I?

# image
# relative-width = 10
# src = images/profile.png

Ad: ClearCode Inc.

# image
# relative-width = 50
# src = images/logo-combination-standard.svg

As a contributor

Agenda

Why play with d/watch?

d/watch for fonts-sawarabi-mincho

version=4
opts="uversionmangle=s/-beta/~beta/;s/-rc/~rc/;s/-preview/~preview/, \
pagemangle=s%<osdn:file url="([^<]*)</osdn:file>%<a href="$1">$1</a>%g, \
downloadurlmangle=s%projects/sawarabi-fonts/downloads%frs/redir\.php?m=iij&f=sawarabi-fonts%g;s/xz\//xz/" \
https://osdn.net/projects/sawarabi-fonts/releases/rss \
https://osdn.net/projects/sawarabi-fonts/downloads/.*/sawarabi-mincho@ANY_VERSION@@ARCHIVE_EXT@/ debian uupdate

d/watch for fonts-sawarabi-mincho

pagemangle?

downloadurlmangle?

uversionmangle?
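
As a rough illustration of what the uversionmangle rules in the watch file above do, the sketch below replays the three substitutions with Python's `re.sub`. uscan applies them as Perl substitutions, so this is only an approximation, and the version strings in the usage note are hypothetical.

```python
import re

# Substitution pairs copied from uversionmangle in the watch file above.
UVERSIONMANGLE = [
    (r"-beta", "~beta"),
    (r"-rc", "~rc"),
    (r"-preview", "~preview"),
]


def mangle_upstream_version(version):
    """Apply each s/pattern/replacement/ rule once, in order."""
    for pattern, replacement in UVERSIONMANGLE:
        # Perl's s/// without the g flag replaces only the first match.
        version = re.sub(pattern, replacement, version, count=1)
    return version
```

For a hypothetical upstream version `1.0-beta1`, `mangle_upstream_version` returns `1.0~beta1`. The tilde matters because Debian version comparison sorts `~` before everything else, so the pre-release sorts earlier than the final `1.0`.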

#899119

# blockquote
# title = #899119#5
Hideki Yamane:\n
"They sometimes changes download way to reduce download access
by preventing bot, so debian/watch file is complicated and it 
annoyed us. Implementing redirector in qa.debian.org would improve
this situation."

Motivation

Agenda

Introduction to debian/watch

The typical examples

Common mistakes to avoid

Common mistakes(1)

Common mistakes(2)

Common mistakes(3)

Common mistakes(4)

Common mistakes(5)

Common mistakes(6)

Common mistakes(7)

Common mistakes(8)

Impressions of d/watch

Motivation again

Agenda

Why do we use statistics?

Collect d/watch data

sources.d.o API documentation

# image
# relative-height = 100
# src = images/sources-d-o-api-documentation.png

Collect package list

e.g. source package list

# image
# relative-height = 100
# src = images/sources-d-o-api-list-zoom.png

Collect package info

e.g. Groonga package info

# image
# relative-height = 100
# src = images/sources-d-o-api-src-groonga-zoom.png

Collect raw URL

e.g. Groonga d/watch raw URL

# image
# relative-height = 90
# src = images/sources-d-o-api-src-groonga-latest-debian-watch-zoom.png

Collect d/watch

e.g. Groonga d/watch

# image
# relative-width = 100
# src = images/sources-d-o-api-groonga-debian-watch-zoom.png

We are ready to collect data

How to collect it?
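
The collection steps (package list, package info, raw URL, d/watch) could be sketched roughly as follows. The endpoint paths, the `latest` version alias, and the `raw_url` field are assumptions based on the public sources.debian.org API documentation, so check them against that documentation before relying on this. The helper names are hypothetical.

```python
import json
from urllib.request import urlopen

BASE = "https://sources.debian.org"


def list_url():
    """URL that lists all source packages (assumed endpoint)."""
    return f"{BASE}/api/list/"


def package_url(package):
    """URL that describes one source package (assumed endpoint)."""
    return f"{BASE}/api/src/{package}/"


def watch_info_url(package, version="latest"):
    """URL that describes debian/watch for a package version (assumed endpoint)."""
    return f"{BASE}/api/src/{package}/{version}/debian/watch/"


def fetch_json(url):
    """Fetch a URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def fetch_watch(package):
    """Return the text of debian/watch, or None if no raw URL is reported."""
    info = fetch_json(watch_info_url(package))
    raw_url = info.get("raw_url")  # assumed field name
    if raw_url is None:
        return None
    with urlopen(BASE + raw_url) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `fetch_watch("groonga")` would need network access. A full crawl over every source package should be throttled so that it does not overload the service.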

Parsing opts in d/watch
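
A minimal sketch of how opts could be pulled out of a watch entry for analysis. It assumes that continuation lines end with a backslash, that options are separated by commas, and that a quoted opts value ends at a double quote followed by whitespace. It is not uscan's own parser, and the function names are hypothetical.

```python
import re

# Quoted form: opts="a=1, b=2" ...   Unquoted form: opts=a=1,b=2 ...
OPTS_RE = re.compile(
    r'\s*opt(?:ion)?s\s*=\s*(?:"(.+?)"(?:\s+|$)|(\S+))', re.DOTALL
)


def join_continuations(text):
    """Drop comment lines and join lines that end with a backslash."""
    lines = [l for l in text.splitlines() if not l.lstrip().startswith("#")]
    return re.sub(r"\\\n\s*", "", "\n".join(lines))


def parse_opts(entry):
    """Return a dict of option name to value for one joined watch entry."""
    match = OPTS_RE.match(entry)
    if match is None:
        return {}
    raw = match.group(1) if match.group(1) is not None else match.group(2)
    opts = {}
    for item in raw.split(","):
        item = item.strip()
        if item:
            # Split at the first "=" only; mangle rules may contain more.
            name, _, value = item.partition("=")
            opts[name] = value
    return opts
```

Splitting on every comma is a simplification: a mangle rule that itself contains a comma would be cut in two, which is acceptable for counting option names but not for replaying the rules.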

Analyzing system components

# image
# relative-height = 100
# src = images/system-components.png

NOTE

Some questions about d/watch

Is the watch file used?

# image
# relative-height = 100
# src = images/group-by-watch-file.png

What version are you using?

# image
# relative-height = 100
# src = images/group-by-watch-version.png

Top 5 hosting sites cover 58%

# image
# relative-height = 100
# src = images/group-by-top5all-hosting.png

Popular hosting?

# image
# relative-height = 100
# src = images/group-by-top5-hosting.png
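
Grouping by hosting site can be approximated by taking the host name of the first URL in each watch entry and counting occurrences. This is a sketch under the assumption that each entry is already joined into one line and that the first token starting with a URL scheme is the upstream page. The function names are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def watch_host(entry):
    """Return the host of the first URL-like token, or None if there is none."""
    for token in entry.split():
        if token.startswith(("http://", "https://", "ftp://")):
            return urlparse(token).hostname
    return None


def count_hosts(entries):
    """Count how many watch entries point at each host."""
    return Counter(host for host in map(watch_host, entries) if host)
```

`count_hosts(entries).most_common(5)` then gives the five most frequent hosts in whatever set of entries was collected.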

These graphs show

Which options are frequently used?

Unused options

Rarely used

Rarely used (2)

Sometimes used

Often used

Which one is used most frequently?

# image
# relative-height = 100
# src = images/opts_frequency.png

Thought experiments on d/watch

Required information?

The new syntax idea

e.g. Diff between the old and new rules

-version=4
+version=5

-opts=filenamemangle=s/.+\/v?(\d\S*)\.tar\.gz/fcitx-imlist-$1\.tar\.gz/ \
-  https://github.com/kenhys/fcitx-imlist/tags .*/v?(\d\S*)\.tar\.gz
+type=github.com,owner=kenhys,project=fcitx-imlist

e.g. The new rule

version=5
type=github.com,owner=kenhys,project=fcitx-imlist
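
One way to reason about the proposed rule is to expand it back into an equivalent version=4 entry. The sketch below does that for the `github.com` type only. The version=5 syntax is this talk's thought experiment, not an existing uscan feature, and both function names are hypothetical.

```python
def parse_rule(line):
    """Parse 'key=value,key=value' into a dict."""
    return dict(item.split("=", 1) for item in line.strip().split(","))


def expand_github_rule(rule):
    """Build a version=4 style watch entry from a type=github.com rule."""
    if rule.get("type") != "github.com":
        raise ValueError("only type=github.com is handled in this sketch")
    owner, project = rule["owner"], rule["project"]
    # Same filenamemangle and tag pattern as the old fcitx-imlist rule.
    opts = rf"opts=filenamemangle=s/.+\/v?(\d\S*)\.tar\.gz/{project}-$1\.tar\.gz/"
    url = f"https://github.com/{owner}/{project}/tags"
    pattern = r".*/v?(\d\S*)\.tar\.gz"
    return f"{opts} {url} {pattern}"
```

Passing the parsed rule `type=github.com,owner=kenhys,project=fcitx-imlist` yields a single-line entry pointing at `https://github.com/kenhys/fcitx-imlist/tags`, which makes it easy to compare the new rule's behavior against the old one.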

Pros

Cons

Experiments

Steps to verify

DEHS?

Test case

The new rule for GitHub

version=5
type=github.com,owner=kenhys,project=fcitx-imlist

How to modify uscan

Is the new d/watch rule good enough?

Conclusion

Q. What about fakeupstream.cgi?

Q. What about a redirector?