Incomplete Slicer package download - `gzip: stdin: invalid compressed data--format violated`

Thank you, the file that you got was interesting. The file has the correct size (373537536 bytes). The first 43883184 bytes were correctly downloaded. After that point, the file content restarts from the beginning.

This indicates that the download was interrupted (this can happen, for example, if the server times out because your network connection is slow). Your download client then assumed that it could resume the interrupted download from where it left off, but the server did not honor this and simply sent the bytes from the beginning again.
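A restart point like this can be located by searching the file for a second copy of its own first bytes. The following is only a minimal sketch of that idea, not the tool used for the analysis above; `find_restart_offset` and `probe_len` are hypothetical names, and it assumes the whole file fits in memory.

```python
from typing import Optional


def find_restart_offset(data: bytes, probe_len: int = 64) -> Optional[int]:
    """Return the offset at which the content starts over, or None.

    Hypothetical helper: looks for a second copy of the first `probe_len`
    bytes, then checks that the bytes following that offset repeat the
    beginning of the file for as long as the two regions can be compared.
    """
    probe = data[:probe_len]
    offset = data.find(probe, 1)  # start at 1 to skip the match at offset 0
    while offset != -1:
        # Compare at most `offset` bytes: only the part before the restart
        # is known to be good data.
        n = min(offset, len(data) - offset)
        if data[offset:offset + n] == data[:n]:
            return offset
        offset = data.find(probe, offset + 1)
    return None
```

For the file described above, calling `find_restart_offset(open(path, 'rb').read())` (with `path` pointing at the broken download) would be expected to return the offset where the correctly downloaded part ends.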

With some help from bing-chat, I checked if the Slicer download server supports resume:

import requests

url = 'https://slicer-packages.kitware.com/api/v1/item/63f5bee68939577d9867b4c7/download'
# Request only the first five bytes; a server that supports ranges answers 206
headers = {'Range': 'bytes=0-4'}
res = requests.head(url, headers=headers)

if res.status_code == 206:
    print('Resume download is supported')
else:
    print('Resume download is not supported')

The result was: Resume download is supported

Just to confirm, I printed the response headers:

>>> res.headers
{'Server': 'nginx',
'Date': 'Mon, 05 Jun 2023 00:48:28 GMT',
'Content-Type': 'application/octet-stream',
'Content-Length': '373537536',
'Connection': 'keep-alive',
'Allow': 'DELETE, GET, HEAD, OPTIONS, PATCH, POST, PUT',
'Girder-Request-Uid': '93225529-db4f-4337-8055-363bcc8e7bb9',
'Accept-Ranges': 'bytes',
'Content-Disposition': 'attachment; filename="Slicer-5.2.2-linux-amd64.tar.gz"',
'Content-Range': 'bytes 0-373537535/373537536',
'Strict-Transport-Security': 'max-age=63072000'}

Again, 'Accept-Ranges': 'bytes' suggested that the server can resume a download.

After this, I tested whether the server actually honors a range, by requesting the file content in the byte range 10-20:

headers = {'Range': 'bytes=10-20'}
res = requests.get(url, headers=headers)
with open('c:/tmp/slicerpackage.bin', 'wb') as f:
    f.write(res.content)

This script downloaded the entire Slicer package and wrote it to the file, starting from byte 0!

The response headers show that the server did not actually respect the range request (see Content-Range):

>>> res.headers
{'Server': 'nginx',
'Date': 'Mon, 05 Jun 2023 00:56:57 GMT',
'Content-Type': 'application/octet-stream',
'Content-Length': '373537536',
'Connection': 'keep-alive',
'Allow': 'DELETE, GET, HEAD, OPTIONS, PATCH, POST, PUT',
'Girder-Request-Uid': '9bbfbc0b-c732-4ee7-a057-975401fd5fcb',
'Accept-Ranges': 'bytes',
'Content-Disposition': 'attachment; filename="Slicer-5.2.2-linux-amd64.tar.gz"',
'Content-Range': 'bytes 0-373537535/373537536',
'Strict-Transport-Security': 'max-age=63072000'}
>>> res.status_code
206

Arguably, the download client should have noticed that it did not get what it asked for and handled the received bytes accordingly, but overall I think the server's behavior is confusing and probably incorrect. All the more so because the server returned status code 206 (“Partial Content”), which is also wrong, since it sent the full content again.

So the problem seems to be that the Slicer download server states that it supports resume, but then ignores the range requested in the header and sends the full file content.
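A download client could protect itself against this by comparing the Content-Range it receives with the range it asked for before appending anything to a partial file. The following is a rough sketch of such a check, not how any particular download client works; `parse_content_range` and `range_was_honored` are hypothetical helper names, and the check is deliberately strict (it also rejects a server that legitimately returns a shorter range near the end of the file).

```python
import re
from typing import Mapping, Optional, Tuple


def parse_content_range(value: str) -> Optional[Tuple[int, int, Optional[int]]]:
    """Parse 'bytes first-last/total' into (first, last, total).

    Returns None if the value is not in that form; total is None for '*'.
    """
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if match is None:
        return None
    first, last, total = match.groups()
    return int(first), int(last), (None if total == "*" else int(total))


def range_was_honored(status_code: int, headers: Mapping[str, str],
                      first: int, last: int) -> bool:
    """Return True only if the response covers exactly bytes first..last."""
    if status_code != 206:
        return False
    parsed = parse_content_range(headers.get("Content-Range", ""))
    if parsed is None:
        return False
    got_first, got_last, _total = parsed
    # If the server declares a length, it must match the requested span.
    declared = headers.get("Content-Length")
    if declared is not None and int(declared) != last - first + 1:
        return False
    return (got_first, got_last) == (first, last)
```

With the response shown above, `range_was_honored(res.status_code, res.headers, 10, 20)` returns False even though the status code is 206. Passing `stream=True` to `requests.get` should allow running this check before the response body is read.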