grab package - github.com/cavaliercoder/grab - Go Packages
Package grab provides an HTTP download manager implementation.
Get is the simplest way to download a file:

    resp, err := grab.Get("/tmp", "http://example.com/example.zip")
    // ...

Get will download the given URL and save it to the given destination directory. The destination filename will be determined automatically by grab using Content-Disposition headers returned by the remote server, or by inspecting the requested URL path.
An empty destination string or "." means the transfer will be stored in the current working directory.
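The filename rule described above can be illustrated with the standard library alone. The following is a minimal sketch, not grab's implementation: `guessFilename` is a hypothetical helper that prefers the `filename` parameter of a Content-Disposition header and otherwise falls back to the last element of the URL path.

```go
package main

import (
	"fmt"
	"mime"
	"net/url"
	"path"
)

// guessFilename is a hypothetical helper, not part of grab. It prefers the
// filename parameter of a Content-Disposition header and falls back to the
// last element of the URL path.
func guessFilename(contentDisposition, rawURL string) (string, error) {
	if contentDisposition != "" {
		if _, params, err := mime.ParseMediaType(contentDisposition); err == nil {
			if name := params["filename"]; name != "" {
				// Drop any directory components supplied by the server.
				return path.Base(name), nil
			}
		}
	}

	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	name := path.Base(u.Path)
	if name == "." || name == "/" {
		// The URL has no usable path element.
		return "", fmt.Errorf("no filename could be determined")
	}
	return name, nil
}

func main() {
	name, err := guessFilename(`attachment; filename="report.pdf"`, "http://example.com/download")
	fmt.Println(name, err)

	name, err = guessFilename("", "http://example.com/example.zip")
	fmt.Println(name, err)

	_, err = guessFilename("", "http://example.com/")
	fmt.Println(err)
}
```

A URL that ends in "/" and carries no header yields an error in this sketch, which mirrors the situation the ErrNoFilename variable describes.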
If a destination file already exists, grab will assume it is a complete or partially complete download of the requested file. If the remote server supports resuming interrupted downloads, grab will resume downloading from the end of the partial file. If the server does not support resumed downloads, the file will be retransferred in its entirety. If the file is already complete, grab will return successfully.
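The decision described above can be sketched as a small pure function. This is an illustration under stated assumptions rather than grab's actual logic: `planTransfer` and the `action` values are hypothetical names, and the sketch assumes the remote size and range support are already known (for example from Content-Length and Accept-Ranges headers). The `bytes=N-` form is the standard HTTP Range syntax for "from offset N to the end".

```go
package main

import "fmt"

// action names what a downloader would do next. All names in this sketch are
// hypothetical and are not part of grab.
type action string

const (
	startFresh action = "start"
	skip       action = "skip"
	resume     action = "resume"
	restart    action = "restart"
)

// planTransfer decides how to treat an existing destination file.
// localSize is the size of the file on disk (0 if absent), remoteSize is the
// size reported by the server, and acceptsRanges reports whether the server
// honours Range requests. The second result is the Range header value to
// send, or "" when the whole file should be requested.
func planTransfer(localSize, remoteSize int64, acceptsRanges bool) (action, string) {
	switch {
	case localSize == 0:
		// Nothing on disk yet: an ordinary download.
		return startFresh, ""
	case localSize == remoteSize:
		// The file is already complete.
		return skip, ""
	case acceptsRanges && localSize < remoteSize:
		// Ask only for the bytes that are still missing.
		return resume, fmt.Sprintf("bytes=%d-", localSize)
	default:
		// No range support (or an unexpected size): transfer everything again.
		return restart, ""
	}
}

func main() {
	fmt.Println(planTransfer(0, 2048, true))
	fmt.Println(planTransfer(2048, 2048, true))
	fmt.Println(planTransfer(512, 2048, true))
	fmt.Println(planTransfer(512, 2048, false))
}
```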
For control over the HTTP client, destination path, auto-resume, checksum validation and other settings, create a Client:

    client := grab.NewClient()
    // Transport is an http.RoundTripper, so compression is disabled here by
    // installing a configured *http.Transport.
    client.HTTPClient.Transport = &http.Transport{DisableCompression: true}

    req, err := grab.NewRequest("/tmp", "http://example.com/example.zip")
    // ...
    req.NoResume = true
    req.HTTPRequest.Header.Set("Authorization", "Basic YWxhZGRpbjpvcGVuc2VzYW1l")

    resp := client.Do(req)
    // ...

You can monitor the progress of downloads while they are transferring:

    client := grab.NewClient()
    req, err := grab.NewRequest("", "http://example.com/example.zip")
    // ...
    resp := client.Do(req)

    t := time.NewTicker(time.Second)
    defer t.Stop()

    for {
        select {
        case <-t.C:
            // Progress returns a ratio, so scale it to a percentage.
            fmt.Printf("%.02f%% complete\n", 100*resp.Progress())

        case <-resp.Done:
            if err := resp.Err(); err != nil {
                // ...
            }
            // ...
            return
        }
    }

func GetBatch(workers int, dst string, urlStrs ...string) (<-chan *Response, error)
- func (c *Response) BytesComplete() int64
- func (c *Response) BytesPerSecond() float64
- func (c *Response) Cancel() error
- func (c *Response) Duration() time.Duration
- func (c *Response) ETA() time.Time
- func (c *Response) Err() error
- func (c *Response) IsComplete() bool
- func (c *Response) Progress() float64
- func (c *Response) Wait()

    var (
        ErrBadLength   = errors.New("bad content length")
        ErrBadChecksum = errors.New("checksum mismatch")
        ErrNoFilename  = errors.New("no filename could be determined")
        ErrNoTimestamp = errors.New("no timestamp could be determined for the remote file")
        ErrFileExists  = errors.New("file exists")
    )

DefaultClient is the default client and is used by all Get convenience functions.
GetBatch sends multiple HTTP requests and downloads the content of the requested URLs to the given destination directory using the given number of concurrent worker goroutines.
The Response for each requested URL is sent through the returned Response channel as soon as a worker receives a response from the remote server. The Response can then be used to track the download while it is in progress.
The returned Response channel will be closed by grab only once all downloads have completed or failed.
If an error occurs during any download, it will be available via a call to the associated Response.Err.
For control over HTTP client headers, redirect policy, and other settings, create a Client instead.
IsStatusCodeError returns true if the given error is of type StatusCodeError.
A Client is a file download client.
Clients are safe for concurrent use by multiple goroutines.
NewClient returns a new file download Client, using default configuration.
func (c *Client) Do(req *Request) *Response
Do sends a file transfer request and returns a file transfer response, following policy (e.g. redirects, cookies, auth) as configured on the client's HTTPClient.
Like http.Get, Do blocks while the transfer is initiated, but returns as soon as the transfer has started in a background goroutine, or if it failed early.
An error is returned via Response.Err if caused by client policy (such as CheckRedirect), or if there was an HTTP protocol or IO error. Response.Err will block the caller until the transfer is completed, successfully or otherwise.

    client := NewClient()
    req, err := NewRequest("/tmp", "http://example.com/example.zip")
    if err != nil {
        panic(err)
    }

    resp := client.Do(req)
    if err := resp.Err(); err != nil {
        panic(err)
    }

    fmt.Println("Download saved to", resp.Filename)

func (c *Client) DoBatch(workers int, requests ...*Request) <-chan *Response
DoBatch executes all the given requests using the given number of concurrent workers. Control is passed back to the caller as soon as the workers are initiated.
If the requested number of workers is less than one, a worker will be created for every request. I.e. all requests will be executed concurrently.
If an error occurs during any of the file transfers, it will be accessible via a call to the associated Response.Err.
The returned Response channel is closed only after all of the given Requests have completed, successfully or otherwise.

    // create multiple download requests
    reqs := make([]*Request, 0)
    for i := 0; i < 10; i++ {
        url := fmt.Sprintf("http://example.com/example%d.zip", i+1)
        req, err := NewRequest("/tmp", url)
        if err != nil {
            panic(err)
        }
        reqs = append(reqs, req)
    }

    // start downloads with 4 workers
    client := NewClient()
    respch := client.DoBatch(4, reqs...)

    // check each response
    for resp := range respch {
        if err := resp.Err(); err != nil {
            panic(err)
        }

        fmt.Printf("Downloaded %s to %s\n", resp.Request.URL(), resp.Filename)
    }

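The batching behaviour described for DoBatch (a fixed number of workers, one worker per request when the count is less than one, and a result channel closed only after everything has finished) follows a common Go worker-pool pattern. The sketch below shows that pattern with the standard library only. It is not grab's code: `runBatch` and `result` are hypothetical names, and unlike grab, which delivers each Response as soon as the server replies, this sketch delivers a result only when a job has finished.

```go
package main

import (
	"fmt"
	"sync"
)

// result pairs a job with its outcome. Hypothetical type for this sketch.
type result struct {
	Job string
	Err error
}

// runBatch runs do for every job using the given number of workers and
// returns immediately. The returned channel is closed once all jobs are done.
func runBatch(workers int, jobs []string, do func(string) error) <-chan result {
	if workers < 1 {
		// Fewer than one worker requested: run every job concurrently.
		workers = len(jobs)
	}

	jobch := make(chan string)
	// Buffer one slot per job so slow receivers never stall a worker.
	resch := make(chan result, len(jobs))

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobch {
				resch <- result{Job: j, Err: do(j)}
			}
		}()
	}

	go func() {
		for _, j := range jobs {
			jobch <- j
		}
		close(jobch)

		// Close the result channel only after every worker has exited.
		wg.Wait()
		close(resch)
	}()

	return resch
}

func main() {
	jobs := []string{"example1.zip", "example2.zip", "example3.zip"}
	for r := range runBatch(2, jobs, func(string) error { return nil }) {
		fmt.Println(r.Job, r.Err)
	}
}
```

Sizing the result buffer to the number of jobs is one way to avoid the slow-receiver stall; a caller that cannot know the job count in advance would need to pick a buffer size suited to its workload.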
func (c *Client) DoChannel(reqch <-chan *Request, respch chan<- *Response)
DoChannel executes all requests sent through the given Request channel, one at a time, until it is closed by another goroutine. The caller is blocked until the Request channel is closed and all transfers have completed. All responses are sent through the given Response channel as soon as they are received from the remote servers and can be used to track the progress of each download.
Slow Response receivers will cause a worker to block and therefore delay the start of the transfer for an already initiated connection - potentially causing a server timeout. It is the caller's responsibility to ensure a sufficient buffer size is used for the Response channel to prevent this.
If an error occurs during any of the file transfers it will be accessible via the associated Response.Err function.
This example uses DoChannel to create a Producer/Consumer model for downloading multiple files concurrently. This is similar to how DoBatch uses DoChannel under the hood except that it allows the caller to continually send new requests until they wish to close the request channel.

    // create a request and a buffered response channel
    reqch := make(chan *Request)
    respch := make(chan *Response, 10)

    // start 4 workers
    client := NewClient()
    wg := sync.WaitGroup{}
    for i := 0; i < 4; i++ {
        wg.Add(1)
        go func() {
            client.DoChannel(reqch, respch)
            wg.Done()
        }()
    }

    go func() {
        // send requests
        for i := 0; i < 10; i++ {
            url := fmt.Sprintf("http://example.com/example%d.zip", i+1)
            req, err := NewRequest("/tmp", url)
            if err != nil {
                panic(err)
            }
            reqch <- req
        }
        close(reqch)

        // wait for workers to finish
        wg.Wait()
        close(respch)
    }()

    // check each response
    for resp := range respch {
        // block until complete
        if err := resp.Err(); err != nil {
            panic(err)
        }

        fmt.Printf("Downloaded %s to %s\n", resp.Request.URL(), resp.Filename)
    }

A Hook is a user-provided callback function that can be called by grab at various stages of a request's lifecycle. If a hook returns an error, the associated request is canceled and the same error is returned on the Response object.
Hook functions are called synchronously and should never block unnecessarily. Response methods that block until a download is complete, such as Response.Err, Response.Cancel or Response.Wait, will deadlock if called from within a hook. To cancel a download from a callback, simply return a non-nil error.
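The cancel-by-returning-an-error contract can be illustrated without grab. In the sketch below every name (`transfer`, `hook`, `maxSize`, `start`, `errTooLarge`) is hypothetical; it only shows hooks being run synchronously, with the first non-nil error stopping the transfer and being recorded as its result.

```go
package main

import (
	"errors"
	"fmt"
)

// transfer is a hypothetical stand-in for a download in progress.
type transfer struct {
	Size int64 // size announced by the server
	err  error // final error, if any
}

// hook is called synchronously before the body is transferred.
type hook func(*transfer) error

var errTooLarge = errors.New("file too large")

// maxSize returns a hook that rejects transfers larger than limit bytes.
func maxSize(limit int64) hook {
	return func(t *transfer) error {
		if t.Size > limit {
			return errTooLarge
		}
		return nil
	}
}

// start runs each hook in order. The first error cancels the transfer and is
// stored on it, so the caller later sees the same error.
func start(t *transfer, hooks ...hook) error {
	for _, h := range hooks {
		if err := h(t); err != nil {
			t.err = err
			return err
		}
	}
	// The body transfer would begin here.
	return nil
}

func main() {
	small := &transfer{Size: 10}
	large := &transfer{Size: 10_000}
	fmt.Println(start(small, maxSize(1024)))
	fmt.Println(start(large, maxSize(1024)))
}
```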
RateLimiter is an interface that must be satisfied by any third-party rate limiters that may be used to limit download transfer speeds.
A recommended token bucket implementation can be found at https://godoc.org/golang.org/x/time/rate#Limiter.

    req, _ := NewRequest("", "http://www.golang-book.com/public/pdf/gobook.pdf")

    // Attach a 1Mbps rate limiter, like the token bucket implementation from
    // golang.org/x/time/rate.
    req.RateLimiter = NewLimiter(1048576)

    resp := DefaultClient.Do(req)
    if err := resp.Err(); err != nil {
        log.Fatal(err)
    }

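For readers who want to see what a limiter of this kind might look like, here is a deliberately simple sketch. It assumes the interface consists of a single `WaitN(ctx, n)` method with the same shape as the one on the x/time/rate Limiter; that assumption should be checked against the RateLimiter definition. `simpleLimiter` and `delayFor` are hypothetical names, and the fixed-delay approach is far cruder than a token bucket.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// simpleLimiter is a hypothetical limiter that delays each chunk of n bytes
// for n/bytesPerSecond seconds. It keeps no state between calls.
type simpleLimiter struct {
	bytesPerSecond int
}

// delayFor returns how long a chunk of n bytes should be held back.
func (l simpleLimiter) delayFor(n int) time.Duration {
	if l.bytesPerSecond <= 0 || n <= 0 {
		return 0
	}
	return time.Duration(n) * time.Second / time.Duration(l.bytesPerSecond)
}

// WaitN blocks for the computed delay, or returns early with the context's
// error if the context is cancelled first.
func (l simpleLimiter) WaitN(ctx context.Context, n int) error {
	timer := time.NewTimer(l.delayFor(n))
	defer timer.Stop()

	select {
	case <-timer.C:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func main() {
	l := simpleLimiter{bytesPerSecond: 1 << 20} // 1048576 bytes per second

	start := time.Now()
	_ = l.WaitN(context.Background(), 64<<10) // a 64 KiB chunk
	fmt.Println("waited", time.Since(start).Round(time.Millisecond))

	// A cancelled context interrupts a long wait.
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	fmt.Println(l.WaitN(ctx, 1<<30))
}
```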
A Request represents an HTTP file transfer request to be sent by a Client.
NewRequest returns a new file transfer Request suitable for use with Client.Do.
Context returns the request's context. To change the context, use WithContext.
The returned context is always non-nil; it defaults to the background context.
The context controls cancelation.
SetChecksum sets the desired hashing algorithm and checksum value to validate a downloaded file. Once the download is complete, the given hashing algorithm will be used to compute the actual checksum of the downloaded file. If the checksums do not match, an error will be returned by the associated Response.Err method.
If deleteOnError is true, the downloaded file will be deleted automatically if it fails checksum validation.
To prevent corruption of the computed checksum, the given hash must not be used by any other request or goroutines.
To disable checksum validation, call SetChecksum with a nil hash.
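The validation step itself is straightforward to sketch with the standard library. The code below is an illustration, not grab's implementation: `verify`, `verifySHA256Hex` and `errBadChecksum` are hypothetical names. It streams the content through the hash and compares the digest with the expected bytes, which also shows why a hash value must not be shared between transfers: its internal state is mutated as data is written.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"hash"
	"io"
	"strings"
)

var errBadChecksum = errors.New("checksum mismatch")

// verify streams r through h and compares the digest with want.
func verify(r io.Reader, h hash.Hash, want []byte) error {
	h.Reset()
	if _, err := io.Copy(h, r); err != nil {
		return err
	}
	if !bytes.Equal(h.Sum(nil), want) {
		return errBadChecksum
	}
	return nil
}

// verifySHA256Hex is a convenience wrapper for hex-encoded SHA-256 sums.
func verifySHA256Hex(content, wantHex string) error {
	want, err := hex.DecodeString(wantHex)
	if err != nil {
		return err
	}
	// A fresh hash per call keeps the computed checksum uncorrupted.
	return verify(strings.NewReader(content), sha256.New(), want)
}

func main() {
	// SHA-256 of the five bytes "hello".
	const sum = "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"

	fmt.Println(verifySHA256Hex("hello", sum))
	fmt.Println(verifySHA256Hex("hello!", sum))
}
```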

    // create download request
    req, err := NewRequest("", "http://example.com/example.zip")
    if err != nil {
        panic(err)
    }

    // set request checksum
    sum, err := hex.DecodeString("33daf4c03f86120fdfdc66bddf6bfff4661c7ca11c5da473e537f4d69b470e57")
    if err != nil {
        panic(err)
    }
    req.SetChecksum(sha256.New(), sum, true)

    // download and validate file
    resp := DefaultClient.Do(req)
    if err := resp.Err(); err != nil {
        panic(err)
    }

URL returns the URL to be downloaded.
WithContext returns a shallow copy of r with its context changed to ctx. The provided ctx must be non-nil.

    // create context with a 100ms timeout
    ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
    defer cancel()

    // create download request with context
    req, err := NewRequest("", "http://example.com/example.zip")
    if err != nil {
        panic(err)
    }
    req = req.WithContext(ctx)

    // send download request
    resp := DefaultClient.Do(req)
    if err := resp.Err(); err != nil {
        fmt.Println("error: request cancelled")
    }

Output:
error: request cancelled
Response represents the response to a completed or in-progress download request.
A response may be returned as soon as an HTTP response is received from a remote server, but before the body content has started transferring.
All Response method calls are thread-safe.
Get sends an HTTP request and downloads the content of the requested URL to the given destination file path. The caller is blocked until the download is completed, successfully or otherwise.
An error is returned if caused by client policy (such as CheckRedirect), or if there was an HTTP protocol or IO error.
For non-blocking calls or control over HTTP client headers, redirect policy, and other settings, create a Client instead.

    // download a file to /tmp
    resp, err := Get("/tmp", "http://example.com/example.zip")
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println("Download saved to", resp.Filename)

func (c *Response) BytesComplete() int64
BytesComplete returns the total number of bytes which have been copied to the destination, including any bytes that were resumed from a previous download.
BytesPerSecond returns the number of bytes transferred in the last second. If the download is already complete, the average bytes/sec for the life of the download is returned.
Cancel cancels the file transfer by canceling the underlying Context for this Response. Cancel blocks until the transfer is closed and returns any error - typically context.Canceled.
Duration returns the duration of a file transfer. If the transfer is in progress, the duration will be between now and the start of the transfer. If the transfer is complete, the duration will be between the start and end of the completed transfer process.
ETA returns the estimated time at which the download will complete, given the current BytesPerSecond. If the transfer has already completed, the actual end time will be returned.
Err blocks the calling goroutine until the underlying file transfer is completed and returns any error that may have occurred. If the download is already completed, Err returns immediately.
func (c *Response) IsComplete() bool
IsComplete returns true if the download has completed. If an error occurred during the download, it can be returned via Err.
Progress returns the ratio of total bytes that have been downloaded. Multiply the returned value by 100 to obtain the percentage completed.
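The arithmetic behind a progress ratio and a time-remaining estimate is sketched below. These helpers are hypothetical and are not grab's code; in particular, returning zero when the size or rate is unknown is a choice made for this sketch only.

```go
package main

import (
	"fmt"
	"time"
)

// progress returns bytesComplete/size as a ratio between 0 and 1.
// An unknown or zero size yields 0 in this sketch.
func progress(bytesComplete, size int64) float64 {
	if size <= 0 {
		return 0
	}
	return float64(bytesComplete) / float64(size)
}

// remaining estimates how long the rest of the transfer will take at the
// given rate. It returns 0 when the rate is unknown or nothing is left.
func remaining(bytesComplete, size int64, bytesPerSecond float64) time.Duration {
	if bytesPerSecond <= 0 || bytesComplete >= size {
		return 0
	}
	seconds := float64(size-bytesComplete) / bytesPerSecond
	return time.Duration(seconds * float64(time.Second))
}

func main() {
	const done, size = 250, 1000

	// Scale the ratio by 100 to print a percentage.
	fmt.Printf("%.02f%% complete\n", 100*progress(done, size))

	// An estimated completion time is "now" plus the remaining duration.
	left := remaining(done, size, 50)
	fmt.Println("time remaining:", left)
	fmt.Println("finishes after now:", time.Now().Add(left).After(time.Now()))
}
```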
func (c *Response) Wait()
Wait blocks until the download is completed.
StatusCodeError indicates that the server response had a status code that was not in the 200-299 range (after following any redirects).
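To make the idea concrete, the sketch below models a status-code error with the standard library. `statusError`, `checkStatus` and `isStatusError` are hypothetical stand-ins that play the roles of StatusCodeError and IsStatusCodeError; the real type's definition and message format may differ.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// statusError is a hypothetical error type carrying an HTTP status code.
type statusError int

func (e statusError) Error() string {
	return fmt.Sprintf("server returned %d %s", int(e), http.StatusText(int(e)))
}

// checkStatus returns a statusError for any code outside the 200-299 range.
func checkStatus(code int) error {
	if code < 200 || code > 299 {
		return statusError(code)
	}
	return nil
}

// isStatusError reports whether err is, or wraps, a statusError.
func isStatusError(err error) bool {
	var se statusError
	return errors.As(err, &se)
}

func main() {
	for _, code := range []int{http.StatusOK, http.StatusNotFound} {
		err := checkStatus(code)
		fmt.Println(code, isStatusError(err), err)
	}
}
```

Checking the error type lets a caller distinguish a server-side refusal (such as 404) from a network or IO failure and react accordingly, for example by not retrying.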