# gofeed

[![Build Status](https://travis-ci.org/mmcdole/gofeed.svg?branch=master)](https://travis-ci.org/mmcdole/gofeed) [![Coverage Status](https://coveralls.io/repos/github/mmcdole/gofeed/badge.svg?branch=master)](https://coveralls.io/github/mmcdole/gofeed?branch=master) [![Go Report Card](https://goreportcard.com/badge/github.com/mmcdole/gofeed)](https://goreportcard.com/report/github.com/mmcdole/gofeed) [![](https://godoc.org/github.com/mmcdole/gofeed?status.svg)](http://godoc.org/github.com/mmcdole/gofeed) [![License](http://img.shields.io/:license-mit-blue.svg)](http://doge.mit-license.org)

The `gofeed` library is a robust feed parser that supports parsing both [RSS](https://en.wikipedia.org/wiki/RSS) and [Atom](https://en.wikipedia.org/wiki/Atom_(standard)) feeds.  The library provides a universal `gofeed.Parser` that will parse and convert all feed types into a hybrid `gofeed.Feed` model.  You also have the option of utilizing the feed specific `atom.Parser` or `rss.Parser` parsers which generate `atom.Feed` and `rss.Feed` respectively.

## Table of Contents
- [Features](#features)
- [Overview](#overview)
- [Basic Usage](#basic-usage)
- [Advanced Usage](#advanced-usage)
- [Extensions](#extensions)
- [Invalid Feeds](#invalid-feeds)
- [Default Mappings](#default-mappings)
- [Dependencies](#dependencies)
- [License](#license)
- [Credits](#credits)

## Features

#### Supported feed types:
* RSS 0.90
* Netscape RSS 0.91
* Userland RSS 0.91
* RSS 0.92
* RSS 0.93
* RSS 0.94
* RSS 1.0
* RSS 2.0
* Atom 0.3
* Atom 1.0

#### Extension Support

The `gofeed` library provides support for parsing several popular predefined extensions into ready-made structs, including [Dublin Core](http://dublincore.org/documents/dces/) and [Apple’s iTunes](https://help.apple.com/itc/podcasts_connect/#/itcb54353390).

It parses all other feed extensions in a generic way (see the [Extensions](#extensions) section for more details).

#### Invalid Feeds

A best-effort attempt is made at parsing broken and invalid XML feeds.  Currently, `gofeed` can successfully parse feeds with the following issues:
- Unescaped/Naked Markup in feed elements
- Undeclared namespace prefixes
- Missing closing tags on certain elements
- Illegal tags within feed elements without namespace prefixes
- Missing "required" elements as specified by the respective feed specs
- Incorrect date formats
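
To make the last item in the list concrete, here is a minimal sketch of tolerant date parsing that uses only the standard library. It is not `gofeed`'s own date parser: the `parseLenientDate` helper and the `candidateLayouts` list are hypothetical names chosen for illustration, and the layout list is deliberately short.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// candidateLayouts is a hypothetical, deliberately short list of layouts.
// A production parser would need many more entries.
var candidateLayouts = []string{
	time.RFC1123Z,
	time.RFC1123,
	time.RFC3339,
	"Mon, 2 Jan 2006 15:04:05 -0700", // RFC 822 style with a one-digit day
	"2006-01-02 15:04:05",
	"2006-01-02",
}

// parseLenientDate tries each layout in order and returns the first
// successful parse, normalized to UTC.
func parseLenientDate(s string) (time.Time, error) {
	s = strings.TrimSpace(s)
	for _, layout := range candidateLayouts {
		if parsed, err := time.Parse(layout, s); err == nil {
			return parsed.UTC(), nil
		}
	}
	return time.Time{}, fmt.Errorf("unrecognized date format: %q", s)
}

func main() {
	inputs := []string{
		"Tue, 03 May 2022 10:00:00 +0000",
		"2019-08-30T12:30:00Z",
		" 2019-04-20 ",
		"yesterday",
	}
	for _, in := range inputs {
		parsed, err := parseLenientDate(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(parsed.Format(time.RFC3339))
	}
}
```

Trying layouts in a fixed order keeps the behavior predictable: the most specific formats come first, and a value that matches nothing produces an error instead of a silently wrong timestamp.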

## Overview

The `gofeed` library consists of a universal feed parser and several feed specific parsers.  Which one you choose depends entirely on your use case.  If you will be handling both RSS and Atom feeds then it makes sense to use the `gofeed.Parser`.  If you know ahead of time that you will only be parsing one feed type then it would make sense to use `rss.Parser` or `atom.Parser`.

#### Universal Feed Parser

The universal `gofeed.Parser` works in 3 stages: detection, parsing and translation.  It first detects the feed type that it is currently parsing.  Then it uses a feed specific parser to parse the feed into its true representation which will be either a `rss.Feed` or `atom.Feed`.  These models cover every field possible for their respective feed types.  Finally, they are *translated* into a `gofeed.Feed` model that is a hybrid of both feed types.  Performing the universal feed parsing in these 3 stages allows for more flexibility and keeps the code base more maintainable by separating RSS and Atom parsing into separate packages.

![Diagram](docs/sequence.png)
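
The detection stage can be pictured with a small standard-library sketch that inspects only the root element of the document. This illustrates the idea and is not `gofeed`'s detector: the `detectFeedType` function and its string return values are hypothetical, and the sketch assumes UTF-8 input.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// detectFeedType reports "rss", "atom" or "unknown" by looking at the first
// start element of the document. It is a hypothetical helper that assumes
// UTF-8 input; other declared encodings would need a CharsetReader.
func detectFeedType(doc string) string {
	dec := xml.NewDecoder(strings.NewReader(doc))
	for {
		tok, err := dec.Token()
		if err != nil {
			// Covers io.EOF on empty input as well as syntax errors.
			return "unknown"
		}
		start, ok := tok.(xml.StartElement)
		if !ok {
			continue // skip the XML declaration, comments and whitespace
		}
		switch strings.ToLower(start.Name.Local) {
		case "rss", "rdf": // <rss> for RSS 0.9x/2.0, <rdf:RDF> for RSS 1.0
			return "rss"
		case "feed":
			return "atom"
		default:
			return "unknown"
		}
	}
}

func main() {
	fmt.Println(detectFeedType(`<rss version="2.0"><channel/></rss>`))
	fmt.Println(detectFeedType(`<feed xmlns="http://www.w3.org/2005/Atom"/>`))
}
```

Stopping at the first start element means the type can be decided without reading the whole document, after which the matching feed specific parser can take over.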

The translation step is done by anything which adheres to the `gofeed.Translator` interface.  The `DefaultRSSTranslator` and `DefaultAtomTranslator` are used behind the scenes when you use the `gofeed.Parser` with its default settings.  You can see how they translate fields from `atom.Feed` or `rss.Feed` to the universal `gofeed.Feed` struct in the [Default Mappings](#default-mappings) section.  However, should you disagree with the way certain fields are translated you can easily supply your own `gofeed.Translator` and override this behavior.  See the [Advanced Usage](#advanced-usage) section for an example of how to do this.

#### Feed Specific Parsers

The `gofeed` library provides two feed specific parsers: `atom.Parser` and `rss.Parser`.  If the hybrid `gofeed.Feed` model that the universal `gofeed.Parser` produces does not contain a field from the `atom.Feed` or `rss.Feed` model that you require, it might be beneficial to use the feed specific parsers.  When using the `atom.Parser` or `rss.Parser` directly, you can access all of the fields found in the `atom.Feed` and `rss.Feed` models.  It is also marginally faster because you are able to skip the translation step.

## Basic Usage

#### Universal Feed Parser

The most common usage scenario will be to use `gofeed.Parser` to parse an arbitrary RSS or Atom feed into the hybrid `gofeed.Feed` model.  This hybrid model allows you to treat RSS and Atom feeds the same.

##### Parse a feed from a URL:

```go
fp := gofeed.NewParser()
feed, _ := fp.ParseURL("http://feeds.twit.tv/twit.xml")
fmt.Println(feed.Title)
```

##### Parse a feed from a string:

```go
feedData := `<rss version="2.0">
<channel>
<title>Sample Feed</title>
</channel>
</rss>`
fp := gofeed.NewParser()
feed, _ := fp.ParseString(feedData)
fmt.Println(feed.Title)
```

##### Parse a feed from an io.Reader:

```go
file, _ := os.Open("/path/to/a/file.xml")
defer file.Close()
fp := gofeed.NewParser()
feed, _ := fp.Parse(file)
fmt.Println(feed.Title)
```

#### Feed Specific Parsers

You can easily use the `rss.Parser` and `atom.Parser` directly if you have a usage scenario that requires it:

##### Parse an RSS feed into an `rss.Feed`

```go
feedData := `<rss version="2.0">
<channel>
<webMaster>example@site.com (Example Name)</webMaster>
</channel>
</rss>`
fp := rss.Parser{}
rssFeed, _ := fp.Parse(strings.NewReader(feedData))
fmt.Println(rssFeed.WebMaster)
```

##### Parse an Atom feed into an `atom.Feed`

```go
feedData := `<feed xmlns="http://www.w3.org/2005/Atom">
<subtitle>Example Atom</subtitle>
</feed>`
fp := atom.Parser{}
atomFeed, _ := fp.Parse(strings.NewReader(feedData))
fmt.Println(atomFeed.Subtitle)
```

## Advanced Usage

##### Parse a feed while using a custom translator

The mappings and precedence order that are outlined in the [Default Mappings](#default-mappings) section are provided by the following two structs: `DefaultRSSTranslator` and `DefaultAtomTranslator`.  If you have fields that you think should have a different precedence, or if you want to make a translator that is aware of an unsupported extension, you can do this by specifying your own RSS or Atom translator when using the `gofeed.Parser`.

Here is a simple example of creating a custom `Translator` that makes the `/rss/channel/itunes:author` field have a higher precedence than the `/rss/channel/managingEditor` field in RSS feeds.  We will wrap the existing `DefaultRSSTranslator` since we only want to change the behavior for a single field.

First we must define a custom translator:

```go
import (
	"fmt"

	"github.com/mmcdole/gofeed"
	"github.com/mmcdole/gofeed/rss"
)

// MyCustomTranslator wraps the DefaultRSSTranslator and overrides only the
// author precedence.
type MyCustomTranslator struct {
	defaultTranslator *gofeed.DefaultRSSTranslator
}

func NewMyCustomTranslator() *MyCustomTranslator {
	t := &MyCustomTranslator{}

	// We create a DefaultRSSTranslator internally so we can wrap its Translate
	// call since we only want to modify the precedence for a single field.
	t.defaultTranslator = &gofeed.DefaultRSSTranslator{}
	return t
}

func (ct *MyCustomTranslator) Translate(feed interface{}) (*gofeed.Feed, error) {
	// Named rssFeed so the variable does not shadow the imported rss package.
	rssFeed, found := feed.(*rss.Feed)
	if !found {
		return nil, fmt.Errorf("feed did not match expected type of *rss.Feed")
	}

	f, err := ct.defaultTranslator.Translate(rssFeed)
	if err != nil {
		return nil, err
	}

	// gofeed.Feed.Author is a *gofeed.Person, so wrap the iTunes author string.
	// When itunes:author is absent, keep the author that the default
	// translator already chose (managingEditor has the highest precedence).
	if rssFeed.ITunesExt != nil && rssFeed.ITunesExt.Author != "" {
		f.Author = &gofeed.Person{Name: rssFeed.ITunesExt.Author}
	}
	return f, nil
}
```

Next you must configure your `gofeed.Parser` to utilize the new `gofeed.Translator`:

```go
feedData := `<rss version="2.0">
<channel>
<managingEditor>Ender Wiggin</managingEditor>
<itunes:author>Valentine Wiggin</itunes:author>
</channel>
</rss>`

fp := gofeed.NewParser()
fp.RSSTranslator = NewMyCustomTranslator()
feed, _ := fp.ParseString(feedData)
fmt.Println(feed.Author.Name) // Valentine Wiggin
```

## Extensions

Every element which does not belong to the feed's default namespace is considered an extension by `gofeed`.  These are parsed and stored in a tree-like structure located at `Feed.Extensions` and `Item.Extensions`.  These fields should allow you to access and read any custom extension elements.
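
As a rough picture of what such a tree-like structure can look like, the sketch below models extensions as a map keyed by namespace prefix and then by element name, with children nested the same way. The `Extension` and `Extensions` types here are local stand-ins written on the assumption that the library's layout is similar; `firstValue` and `sampleExtensions` are hypothetical helpers. Check the package documentation for the actual definitions.

```go
package main

import "fmt"

// Extension is a local stand-in for one extension element (an assumption
// about the shape, not a copy of the library's type).
type Extension struct {
	Name     string
	Value    string
	Attrs    map[string]string
	Children map[string][]Extension
}

// Extensions maps a namespace prefix to element names to parsed elements.
type Extensions map[string]map[string][]Extension

// firstValue returns the text of the first prefix:name element, if present.
func firstValue(exts Extensions, prefix, name string) (string, bool) {
	elems := exts[prefix][name] // indexing a nil map is safe and yields nil
	if len(elems) == 0 {
		return "", false
	}
	return elems[0].Value, true
}

// sampleExtensions builds a small tree by hand for demonstration purposes.
func sampleExtensions() Extensions {
	return Extensions{
		"media": {
			"group": {
				{
					Name: "group",
					Children: map[string][]Extension{
						"title": {{Name: "title", Value: "Episode 1"}},
					},
				},
			},
		},
		"custom": {
			"rating": {
				{Name: "rating", Value: "5", Attrs: map[string]string{"scale": "10"}},
			},
		},
	}
}

func main() {
	exts := sampleExtensions()
	if v, ok := firstValue(exts, "custom", "rating"); ok {
		fmt.Println("custom:rating =", v)
	}
	group := exts["media"]["group"][0]
	fmt.Println("media:group/title =", group.Children["title"][0].Value)
}
```

Keying by prefix and then by name keeps lookups for a known extension element to two map accesses, while the slice at the leaf preserves repeated elements in document order.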

In addition to the generic handling of extensions, `gofeed` also has built-in support for parsing certain popular extensions into their own structs for convenience.  It currently supports the [Dublin Core](http://dublincore.org/documents/dces/) and [Apple iTunes](https://help.apple.com/itc/podcasts_connect/#/itcb54353390) extensions, which you can access at `Feed.ITunesExt`, `Feed.DublinCoreExt`, `Item.ITunesExt` and `Item.DublinCoreExt`.

## Default Mappings

The `DefaultRSSTranslator` and the `DefaultAtomTranslator` map the following `rss.Feed` and `atom.Feed` fields to their respective `gofeed.Feed` fields.  They are listed in order of precedence (highest to lowest):


`gofeed.Feed` | RSS | Atom
--- | --- | ---
Title | /rss/channel/title<br>/rdf:RDF/channel/title<br>/rss/channel/dc:title<br>/rdf:RDF/channel/dc:title | /feed/title
Description | /rss/channel/description<br>/rdf:RDF/channel/description<br>/rss/channel/itunes:subtitle | /feed/subtitle<br>/feed/tagline
Link | /rss/channel/link<br>/rdf:RDF/channel/link | /feed/link[@rel="alternate"]/@href<br>/feed/link[not(@rel)]/@href
FeedLink | /rss/channel/atom:link[@rel="self"]/@href<br>/rdf:RDF/channel/atom:link[@rel="self"]/@href | /feed/link[@rel="self"]/@href
Updated | /rss/channel/lastBuildDate<br>/rss/channel/dc:date<br>/rdf:RDF/channel/dc:date | /feed/updated<br>/feed/modified
Published | /rss/channel/pubDate |
Author | /rss/channel/managingEditor<br>/rss/channel/webMaster<br>/rss/channel/dc:author<br>/rdf:RDF/channel/dc:author<br>/rss/channel/dc:creator<br>/rdf:RDF/channel/dc:creator<br>/rss/channel/itunes:author | /feed/author
Language | /rss/channel/language<br>/rss/channel/dc:language<br>/rdf:RDF/channel/dc:language | /feed/@xml:lang
Image | /rss/channel/image<br>/rdf:RDF/image<br>/rss/channel/itunes:image | /feed/logo
Copyright | /rss/channel/copyright<br>/rss/channel/dc:rights<br>/rdf:RDF/channel/dc:rights | /feed/rights<br>/feed/copyright
Generator | /rss/channel/generator | /feed/generator
Categories | /rss/channel/category<br>/rss/channel/itunes:category<br>/rss/channel/itunes:keywords<br>/rss/channel/dc:subject<br>/rdf:RDF/channel/dc:subject | /feed/category


`gofeed.Item` | RSS | Atom
--- | --- | ---
Title | /rss/channel/item/title<br>/rdf:RDF/item/title<br>/rdf:RDF/item/dc:title<br>/rss/channel/item/dc:title | /feed/entry/title
Description | /rss/channel/item/description<br>/rdf:RDF/item/description<br>/rss/channel/item/dc:description<br>/rdf:RDF/item/dc:description | /feed/entry/summary
Content | /rss/channel/item/content:encoded | /feed/entry/content
Link | /rss/channel/item/link<br>/rdf:RDF/item/link | /feed/entry/link[@rel="alternate"]/@href<br>/feed/entry/link[not(@rel)]/@href
Updated | /rss/channel/item/dc:date<br>/rdf:RDF/rdf:item/dc:date | /feed/entry/modified<br>/feed/entry/updated
Published | /rss/channel/item/pubDate<br>/rss/channel/item/dc:date | /feed/entry/published<br>/feed/entry/issued
Author | /rss/channel/item/author<br>/rss/channel/item/dc:author<br>/rdf:RDF/item/dc:author<br>/rss/channel/item/dc:creator<br>/rdf:RDF/item/dc:creator<br>/rss/channel/item/itunes:author | /feed/entry/author
GUID | /rss/channel/item/guid | /feed/entry/id
Image | /rss/channel/item/itunes:image<br>/rss/channel/item/media:image |
Categories | /rss/channel/item/category<br>/rss/channel/item/dc:subject<br>/rss/channel/item/itunes:keywords<br>/rdf:RDF/channel/item/dc:subject | /feed/entry/category
Enclosures | /rss/channel/item/enclosure | /feed/entry/link[@rel="enclosure"]
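
Reading the tables as "the first source that has a value wins" can be sketched in a few lines. The code below is an illustration of that precedence rule only, not the translators' implementation: the `channel` struct, `firstNonEmpty` and `translateTitle` are hypothetical, and trimming whitespace before testing for emptiness is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// firstNonEmpty returns the first candidate that is non-empty after trimming
// surrounding whitespace, or "" when every candidate is empty.
func firstNonEmpty(candidates ...string) string {
	for _, c := range candidates {
		if s := strings.TrimSpace(c); s != "" {
			return s
		}
	}
	return ""
}

// channel is a hypothetical stand-in holding two possible title sources.
type channel struct {
	Title   string // /rss/channel/title
	DCTitle string // /rss/channel/dc:title
}

// translateTitle applies the precedence order from the first table row:
// the native title wins, the Dublin Core title is the fallback.
func translateTitle(ch channel) string {
	return firstNonEmpty(ch.Title, ch.DCTitle)
}

func main() {
	fmt.Println(translateTitle(channel{Title: "Native", DCTitle: "Dublin Core"}))
	fmt.Println(translateTitle(channel{DCTitle: "Dublin Core"}))
}
```

A custom translator changes the outcome simply by reordering the candidates, which is what the example in the Advanced Usage section does for the author field.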

## Dependencies

* [goxpp](https://github.com/mmcdole/goxpp) - XML Pull Parser
* [goquery](https://github.com/PuerkitoBio/goquery) - Go jQuery-like interface
* [testify](https://github.com/stretchr/testify) - Unit test enhancements

## License

This project is licensed under the [MIT License](https://raw.githubusercontent.com/mmcdole/gofeed/master/LICENSE).

## Credits

* [cristoper](https://github.com/cristoper) for his work on implementing xml:base relative URI handling.
* [Mark Pilgrim](https://en.wikipedia.org/wiki/Mark_Pilgrim) and [Kurt McKee](http://kurtmckee.org) for their work on the excellent [Universal Feed Parser](https://github.com/kurtmckee/feedparser) Python library.  This library was the inspiration for the `gofeed` library.
* [Dan MacTough](http://blog.mact.me) for his work on [node-feedparser](https://github.com/danmactough/node-feedparser).  It provided inspiration for the set of fields that should be covered in the hybrid `gofeed.Feed` model.
* [Matt Jibson](https://mattjibson.com/) for his date parsing function in the [goread](https://github.com/mjibson/goread) project.
* [Jim Teeuwen](https://github.com/jteeuwen) for his method of representing arbitrary feed extensions in the [go-pkg-rss](https://github.com/jteeuwen/go-pkg-rss) library.