Provides paths to WARC files hosted by Common Crawl (commoncrawl.org). Intended for use with spark_read_warc.
cc_warc(start, end = start)
start
Index of the first path to retrieve.
end
Index of the last path to retrieve. Defaults to start, so a single path is returned when end is omitted.
cc_warc(1)
cc_warc(2, 3)
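As a minimal sketch of how the result might be handed to spark_read_warc, the snippet below assumes that a local Spark installation is available to sparklyr and that the Spark session can reach the storage location the paths point to (which may need extra configuration). The table name "warc" is an arbitrary choice made for illustration, not something this page prescribes.

```r
library(sparklyr)
library(sparkwarc)

# Assumption: Spark is installed locally and reachable through sparklyr.
sc <- spark_connect(master = "local")

# Paths for the first two WARC files.
paths <- cc_warc(1, 2)

# Assumption: "warc" is an illustrative table name; reading the files may
# require additional storage configuration that is not shown here.
spark_read_warc(sc, "warc", paths)

spark_disconnect(sc)
```

Keeping the range small (one or two paths) is a sensible starting point, since each WARC file is large.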