
Gowap [Wappalyzer implementation in Go]


Usage

Using the package

go get github.com/unstppbl/gowap

Call the Init() function with a Config object created by the NewConfig() function. It returns a Wappalyzer object, on which you can call the Analyze method with a URL string as its argument.

// Create a Config object and customize it
config := gowap.NewConfig()
// Path to a technologies.json file that overrides the default one
config.AppsJSONPath = "path/to/my/technologies.json"
// Timeout in seconds for fetching the URL
config.TimeoutSeconds = 5
// Timeout in seconds for loading the page
config.LoadingTimeoutSeconds = 5
// Pages deeper than this number are not analyzed. The default (0) disables recursion (only the first page is analyzed)
config.MaxDepth = 2
// Maximum number of pages to visit; analysis stops when it is reached
config.MaxVisitedLinks = 10
// Delay in ms between requests
config.MsDelayBetweenRequests = 200
// Scraper to use: "rod" (default) or "colly"
config.Scraper = "colly"
// Override the user-agent string
config.UserAgent = "GoWap"
// Output as a JSON string
config.JSON = true

// Initialization
wapp, err := gowap.Init(config)
if err != nil {
	log.Fatal(err) // requires the standard "log" package
}
// Scraping
url := "https://scrapethissite.com/"
res, err := wapp.Analyze(url)
if err != nil {
	log.Fatal(err)
}
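
To make the interaction between MaxDepth and MaxVisitedLinks concrete, here is a minimal sketch that walks an in-memory link graph under both limits. It is not gowap's crawler: the limits struct, the crawl function, the breadth-first order, and the choice to count the first page toward MaxVisitedLinks are all assumptions made for illustration.

```go
package main

import "fmt"

// limits is a local stand-in for two of the Config fields described above.
// It is NOT gowap's Config type.
type limits struct {
	MaxDepth        int // pages deeper than this are not analyzed; 0 = first page only
	MaxVisitedLinks int // stop once this many pages have been visited
}

// queued is a page waiting to be visited, with its distance from the start page.
type queued struct {
	url   string
	depth int
}

// crawl walks a hypothetical link graph breadth-first and returns the pages
// that would be analyzed under the given limits (assumed semantics).
func crawl(start string, links map[string][]string, lim limits) []string {
	seen := map[string]bool{start: true}
	queue := []queued{{start, 0}}
	var visited []string
	for len(queue) > 0 && len(visited) < lim.MaxVisitedLinks {
		page := queue[0]
		queue = queue[1:]
		visited = append(visited, page.url)
		if page.depth >= lim.MaxDepth {
			continue // links found on this page would exceed MaxDepth
		}
		for _, next := range links[page.url] {
			if !seen[next] {
				seen[next] = true
				queue = append(queue, queued{next, page.depth + 1})
			}
		}
	}
	return visited
}

func main() {
	links := map[string][]string{
		"/":  {"/a", "/b"},
		"/a": {"/a/1"},
		"/b": {"/b/1"},
	}
	// MaxDepth 0: only the first page is analyzed.
	fmt.Println(crawl("/", links, limits{MaxDepth: 0, MaxVisitedLinks: 10})) // [/]
	// MaxVisitedLinks 3: the walk stops before reaching depth 2.
	fmt.Println(crawl("/", links, limits{MaxDepth: 2, MaxVisitedLinks: 3})) // [/ /a /b]
}
```

With MaxDepth left at 0 the other limits rarely matter, since only one page is fetched; raising MaxDepth is what makes MaxVisitedLinks and the request delay relevant.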

Using the cmd

You can build the cmd with the following command:

go build -o gowap cmd/gowap/main.go

Then run the compiled binary:

You must specify a url to analyse
Usage : gowap [options] <url>
  -delay int
        Delay in ms between requests (default 100)
  -depth int
        Don't analyze page when depth superior to this number. Default (0) means no recursivity (only first page will be analyzed)
  -file string
        Path to override default technologies.json file
  -h	Help
  -loadtimeout int
        Timeout in seconds for loading the page (default 3)
  -maxlinks int
        Max number of pages to visit. Exit when reached (default 5)
  -pretty
        Pretty print json output
  -scraper string
        Choose scraper between rod (default) and colly (default "rod")
  -timeout int
        Timeout in seconds for fetching the url (default 3)
  -useragent string
        Override the user-agent string
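
The options above line up closely with the Config fields from the package usage. The sketch below shows one way such a mapping could be written with the standard flag package. It is an assumption, not the code in cmd/gowap/main.go: cliConfig and parseArgs are hypothetical names, the flag-to-field pairing is inferred from the matching descriptions, and -pretty and -h are left out for brevity. Defaults are copied from the help text.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// cliConfig is a local stand-in whose field names follow the Config fields
// shown in the package usage. It is NOT gowap's own type.
type cliConfig struct {
	AppsJSONPath           string
	TimeoutSeconds         int
	LoadingTimeoutSeconds  int
	MaxDepth               int
	MaxVisitedLinks        int
	MsDelayBetweenRequests int
	Scraper                string
	UserAgent              string
}

// parseArgs reads the documented options followed by the <url> argument.
func parseArgs(args []string) (cliConfig, string, error) {
	var c cliConfig
	fs := flag.NewFlagSet("gowap", flag.ContinueOnError)
	fs.IntVar(&c.MsDelayBetweenRequests, "delay", 100, "Delay in ms between requests")
	fs.IntVar(&c.MaxDepth, "depth", 0, "Maximum page depth to analyze")
	fs.StringVar(&c.AppsJSONPath, "file", "", "Path to override default technologies.json file")
	fs.IntVar(&c.LoadingTimeoutSeconds, "loadtimeout", 3, "Timeout in seconds for loading the page")
	fs.IntVar(&c.MaxVisitedLinks, "maxlinks", 5, "Max number of pages to visit")
	fs.StringVar(&c.Scraper, "scraper", "rod", "Choose scraper between rod and colly")
	fs.IntVar(&c.TimeoutSeconds, "timeout", 3, "Timeout in seconds for fetching the url")
	fs.StringVar(&c.UserAgent, "useragent", "", "Override the user-agent string")
	if err := fs.Parse(args); err != nil {
		return c, "", err
	}
	// Exactly one positional argument is expected: the URL to analyze.
	if fs.NArg() != 1 {
		return c, "", errors.New("a url to analyse is required")
	}
	return c, fs.Arg(0), nil
}

func main() {
	// Example arguments, equivalent to:
	//   gowap -scraper colly -depth 1 https://scrapethissite.com/
	args := []string{"-scraper", "colly", "-depth", "1", "https://scrapethissite.com/"}
	cfg, url, err := parseArgs(args)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s: %+v\n", url, cfg)
}
```

Note that options must come before the URL, as in the usage line: the flag package stops parsing at the first non-flag argument.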

To Do

List of some ideas :