GitHub - ckruse/microformats2-elixir: Microformats2 parser in Elixir

Microformats2

A Microformats2 parser for Elixir.

Installation

Install the package by adding :microformats2 to your list of dependencies in mix.exs:

def deps do
  [
    {:microformats2, "~> 1.0.0"}
  ]
end

To parse directly from URLs, also add :tesla to your list of dependencies in mix.exs:

def deps do
  [
    {:microformats2, "~> 1.0.0"},
    {:tesla, "~> 1.4.4"}
  ]
end

Usage

Give the parser an HTML string and the URL it was fetched from:

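Microformats2.parse("""
<div class="h-card">
  <img class="u-photo" alt="photo of Mitchell" src="https://webfwd.org/content/about-experts/300.mitchellbaker/mentor_mbaker.jpg"/>
  <a class="p-name u-url" href="http://blog.lizardwrangler.com/">Mitchell Baker</a>
  (<a class="u-url" href="https://twitter.com/MitchellBaker">@MitchellBaker</a>)
  <span class="p-org">Mozilla Foundation</span>
  <p class="p-note">
    Mitchell is responsible for setting the direction and scope of the Mozilla Foundation and its activities.
  </p>
  <span class="p-category">Strategy</span>
  <span class="p-category">Leadership</span>
</div>
""", "http://example.org")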

The parser returns a structure like this:

%{
  "items" => [
    %{
      "properties" => %{
        "category" => ["Strategy", "Leadership"],
        "name" => ["Mitchell Baker"],
        "note" => ["Mitchell is responsible for setting the direction and scope of the Mozilla Foundation and its activities."],
        "org" => ["Mozilla Foundation"],
        "photo" => [
          %{
            "alt" => "photo of Mitchell",
            "value" => "https://webfwd.org/content/about-experts/300.mitchellbaker/mentor_mbaker.jpg"
          }
        ],
        "url" => ["http://blog.lizardwrangler.com/", "https://twitter.com/MitchellBaker"]
      },
      "type" => ["h-card"]
    }
  ],
  "rel-urls" => %{},
  "rels" => %{}
}
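
Because the result is a plain map with string keys, you can read values out of it with ordinary Elixir functions. The following is a minimal sketch, not part of the library: the CardSummary module and its function names are hypothetical, and the map literal is a trimmed copy of the result shown above.

```elixir
# Hypothetical helper module (not part of microformats2) for reading
# values out of the map returned by the parser.
defmodule CardSummary do
  # Returns the first value of `property` on the first item in `result`,
  # or nil when the item or the property is missing.
  def first_value(result, property) do
    result
    |> Map.get("items", [])
    |> Enum.at(0, %{})
    |> get_in(["properties", property])
    |> List.wrap()
    |> List.first()
  end

  # Returns every value of `property` across all items whose "type"
  # list contains `type` (for example "h-card").
  def values_for(result, type, property) do
    result
    |> Map.get("items", [])
    |> Enum.filter(fn item -> type in Map.get(item, "type", []) end)
    |> Enum.flat_map(fn item -> get_in(item, ["properties", property]) || [] end)
  end
end

# A trimmed copy of the parser result, written as a literal so the
# sketch runs without any dependencies.
result = %{
  "items" => [
    %{
      "properties" => %{
        "name" => ["Mitchell Baker"],
        "org" => ["Mozilla Foundation"],
        "url" => ["http://blog.lizardwrangler.com/", "https://twitter.com/MitchellBaker"]
      },
      "type" => ["h-card"]
    }
  ],
  "rel-urls" => %{},
  "rels" => %{}
}

IO.inspect(CardSummary.first_value(result, "name"))
IO.inspect(CardSummary.values_for(result, "h-card", "url"))
```

Every property value is a list, even when there is only one entry, so helpers like these need to handle both the empty and the multi-valued case.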

You can also provide HTML trees already parsed with Floki:

Microformats2.parse(Floki.parse("<div class=\"h-card\">..."), "http://example.org")

Or pass a URL if you have Tesla installed:

Microformats2.parse("http://example.org")

Dependencies

We need Floki for HTML parsing and, optionally, Tesla for fetching URLs.

Features

Implemented:

Not implemented:

Copyright (c) 2018 Christian Kruse <cjk@defunct.ch>

This work is free. You can redistribute it and/or modify it under the terms of the MIT License. See the LICENSE.md file for more details.