Catalog
Data Models
All types of catalog items inherit from `Item`, which is stored as a multi-table Django model. One `Item` may have multiple `ExternalResource`s, each representing one page on an external site.
```mermaid
classDiagram
    class Item {
        <<abstract>>
    }
    Item <|-- Album
    class Album {
        +String barcode
        +String Douban_ID
        +String Spotify_ID
    }
    Item <|-- Game
    class Game {
        +String Steam_ID
    }
    Item <|-- Podcast
    class Podcast {
        +String feed_url
        +String Apple_ID
    }
    Item <|-- Performance
    Item <|-- Work
    class Work {
        +String Douban_Work_ID
        +String Goodreads_Work_ID
    }
    Item <|-- Edition
    Item <|-- Series
    Series *-- Work
    Work *-- Edition
    class Series {
        +String Goodreads_Series_ID
    }
    class Work {
        +String Douban_ID
        +String Goodreads_ID
    }
    class Edition{
        +String ISBN
        +String Douban_ID
        +String Goodreads_ID
        +String GoogleBooks_ID
    }
    Item <|-- Movie
    Item <|-- TVShow
    Item <|-- TVSeason
    Item <|-- TVEpisode
    TVShow *-- TVSeason
    TVSeason *-- TVEpisode
    class TVShow{
        +String IMDB_ID
        +String TMDB_ID
    }
    class TVSeason{
        +String Douban_ID
        +String TMDB_ID
    }
    class TVEpisode{
        +String IMDB_ID
        +String TMDB_ID
    }
    class Movie{
        +String Douban_ID
        +String IMDB_ID
        +String TMDB_ID
    }
    Item <|-- Collection
    ExternalResource --* Item
    class ExternalResource {
        +enum site
        +url: string
    }
```
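The relationship between an item and its external resources can be sketched in plain Python. This is only an illustration using dataclasses, not the actual Django models: the field names (`title`, `external_resources`, `imdb_id`, `tmdb_id`) and the example URLs are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExternalResource:
    """One page on an external site (hypothetical stand-in, not the Django model)."""
    site: str  # the real model uses an enum; a plain string is assumed here
    url: str


@dataclass
class Item:
    """Base for every catalog item; holds zero or more external resources."""
    title: str
    external_resources: List[ExternalResource] = field(default_factory=list)


@dataclass
class Movie(Item):
    """A concrete item type with its own site-specific identifiers."""
    imdb_id: str = ""
    tmdb_id: str = ""


# One item, two pages on two different external sites.
movie = Movie(title="Example Movie", imdb_id="tt0000001")
movie.external_resources.append(
    ExternalResource("imdb", "https://www.imdb.com/title/tt0000001/")
)
movie.external_resources.append(
    ExternalResource("tmdb", "https://www.themoviedb.org/movie/1")
)
```

In the real project the subclass relationship is expressed through Django model inheritance, so each item type would have its own table rather than being an in-memory object as in this sketch.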
Add a new site
- If an official API is available for the site, prefer it as the way to get data.
- add a new value to `IdType` and `SiteName` in `catalog/common/models.py`
- add a new file in `catalog/sites/` containing a new class that inherits from `AbstractSite`, with:
    - `SITE_NAME`
    - `ID_TYPE`
    - `URL_PATTERNS`
    - `WIKI_PROPERTY_ID` (not used now)
    - `DEFAULT_MODEL` (unless specified in the `scrape()` return value)
    - a classmethod `id_to_url()`
    - a method `scrape()` that returns a `ResourceContent` object
    - `BasicDownloader` or `ProxiedDownloader` can be used to download website content or API data, e.g. `content = BasicDownloader(url).download().html()`
    - check out existing files in `catalog/sites/` for more examples
- add an import in `catalog/sites/__init__.py`
- add some tests to `catalog/<folder>/tests.py` according to site type
    - add `DOWNLOADER_SAVEDIR = '/tmp'` to `settings.py` to save all responses to `/tmp`
    - run `neodb-manage cat <url>` for debugging or for saving the response file to `/tmp`. The code of `cat` is in `catalog/management/commands/cat.py`
    - move the captured response files to `test_data/`, except large files and images. If such a file is required, replace it with the smallest possible version (e.g. a 1x1 pixel image or 1s of audio)
    - add the `@use_local_response` decorator to test methods that should pick up these responses (if `BasicDownloader` or `ProxiedDownloader` is used)
- run all the tests and make sure they pass
    - Command: `neodb-manage python3 manage.py test [--keepdb]`.
    - See this issue if `lxml.etree.ParserError` occurs on macOS.
- add a site UI label style to `common/static/scss/_sitelabel.scss`
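The shape of a site class described in the steps above might look roughly like the following. This is a self-contained sketch, not the project's code: `IdType`, `ResourceContent` and `AbstractSite` are simplified stand-ins defined here so the example runs on its own, and `ExampleSite`, its URL pattern, the `url_to_id()` helper and the `metadata` field are all hypothetical. A real `scrape()` would fetch the page (for instance with `BasicDownloader`) instead of returning canned data.

```python
import re
from dataclasses import dataclass, field
from enum import Enum


class IdType(Enum):
    """Stand-in for the IdType enum in catalog/common/models.py."""
    ExampleSite = "examplesite"  # hypothetical new value


@dataclass
class ResourceContent:
    """Stand-in for the object scrape() returns; the real fields may differ."""
    metadata: dict = field(default_factory=dict)


class AbstractSite:
    """Simplified stand-in for the real base class."""
    SITE_NAME = None
    ID_TYPE = None
    URL_PATTERNS = []
    WIKI_PROPERTY_ID = None
    DEFAULT_MODEL = None

    def __init__(self, url):
        self.url = url
        self.id_value = self.url_to_id(url)

    @classmethod
    def url_to_id(cls, url):
        # Hypothetical helper: return the first capture group of the
        # first pattern that matches the start of the URL.
        for pattern in cls.URL_PATTERNS:
            match = re.match(pattern, url)
            if match:
                return match.group(1)
        return None


class ExampleSite(AbstractSite):
    SITE_NAME = "examplesite"
    ID_TYPE = IdType.ExampleSite
    URL_PATTERNS = [r"https://example\.org/movie/(\d+)"]
    WIKI_PROPERTY_ID = "?"  # not used now
    DEFAULT_MODEL = "Movie"  # assumed; the real attribute likely references a model class

    @classmethod
    def id_to_url(cls, id_value):
        # Build the canonical page URL from a site-specific id.
        return f"https://example.org/movie/{id_value}"

    def scrape(self):
        # A real implementation would download and parse the page here;
        # this sketch returns canned metadata instead.
        return ResourceContent(metadata={"title": "Example", "id": self.id_value})


site = ExampleSite("https://example.org/movie/42")
content = site.scrape()
print(site.id_value, ExampleSite.id_to_url(site.id_value), content.metadata)
```

Keeping `id_to_url()` and the entries in `URL_PATTERNS` as inverses of each other makes it easy to check in a test that a URL survives a round trip through its id.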