Download Analytics

The Podlove Publisher tracks download intents made by clients. It records that a download was started, but not whether it was completed. For brevity, this document speaks of “downloads”, but what is actually tracked are download intents. When you look at your data, keep in mind that the numbers displayed do not represent the actual number of listeners.

Anatomy of Tracking URLs

The Publisher creates “tracking URLs”. For example, if your media file is this:

media.example.com/podcast/pod001.m4a

then a tracking URL might look like this:

example.com/podlove/file/646/s/feed/c/m4a/pod001.m4a

Requests to tracking URLs are intercepted by the Publisher, analyzed, and finally redirected to the actual physical file. The components of the URL are explained one by one below.

example.com/podlove/file/646/s/feed/c/m4a/pod001.m4a

The first part, example.com/podlove, is your blog domain followed by a “podlove” URL prefix, so tracking URLs don’t interfere with your blog’s pages.

example.com/podlove/file/646/s/feed/c/m4a/pod001.m4a

The part file/646 identifies the actual file to download.

example.com/podlove/file/646/s/feed/c/m4a/pod001.m4a

The parts s/feed and c/m4a are indicators for the source (s) and context (c) of the download. This allows you to have separate analytics for downloads from feeds, the web player, etc.

example.com/podlove/file/646/s/feed/c/m4a/pod001.m4a

The final part, pod001.m4a, looks like a file name but is purely decorative. It makes the URL easier to read, and some command line clients use that part of the URL to auto-generate a filename. It is irrelevant for the purpose of tracking.
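The URL anatomy above can be illustrated with a small parser. This is only a sketch based on the example URL in this document; the names TrackingUrl and parse_tracking_url are hypothetical and not part of the Publisher, which handles these URLs internally.

```python
from dataclasses import dataclass


@dataclass
class TrackingUrl:
    file_id: int    # the number after "file/"
    source: str     # the value after "s/"
    context: str    # the value after "c/"
    filename: str   # decorative trailing part, irrelevant for tracking


def parse_tracking_url(url: str) -> TrackingUrl:
    """Split a tracking URL of the form shown above into its components."""
    # Drop an optional scheme, then split the rest into path segments.
    path = url.split("://", 1)[-1]
    segments = path.split("/")
    # Everything before the "podlove" prefix is the blog domain.
    start = segments.index("podlove")
    _, file_kw, file_id, s_kw, source, c_kw, context, filename = segments[start:start + 8]
    if (file_kw, s_kw, c_kw) != ("file", "s", "c"):
        raise ValueError(f"not a tracking URL: {url}")
    return TrackingUrl(int(file_id), source, context, filename)


parsed = parse_tracking_url("example.com/podlove/file/646/s/feed/c/m4a/pod001.m4a")
print(parsed)
```

For the example URL, the parsed file ID is 646, the source is "feed", and the context is "m4a".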

Tracking Data

Only real download requests are tracked. In more technical terms: HEAD requests are ignored. These are the analyzed and saved values:

UA (User Agent)

Based on the UA, facts about the client can be derived or guessed, such as:

  • client name (Chrome, Firefox, iTunes, Instacast, …)
  • operating system (Android, iOS, Mac, …)
  • device (iPhone, Galaxy Nexus, …)

Furthermore, bots can be detected.

File ID

A reference to the downloaded file is kept. This makes it possible to group tracking data by episode.

Request ID

The request ID is an artificial compound ID to identify identical requests anonymously. It is a hash from the combined IP address and user agent string.
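As a rough illustration of such a compound ID, the sketch below hashes the IP address together with the user agent string. The choice of SHA-256, the concatenation order, and the function name request_id are assumptions made for this example; the Publisher's actual implementation may differ.

```python
import hashlib


def request_id(ip_address: str, user_agent: str) -> str:
    """Derive an anonymous ID from the IP address and user agent string."""
    # Assumption: SHA-256 over the plain concatenation of both values.
    # The hash function and concatenation used by the Publisher may differ.
    combined = (ip_address + user_agent).encode("utf-8")
    return hashlib.sha256(combined).hexdigest()


first = request_id("203.0.113.7", "ExamplePodcastApp/1.0")
second = request_id("203.0.113.7", "ExamplePodcastApp/1.0")
other = request_id("203.0.113.7", "AnotherClient/2.3")
print(first == second)  # identical requests map to the same ID
print(first == other)   # a different user agent yields a different ID
```

Because only the hash is stored, identical requests can be recognized without keeping the IP address itself.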

Source & Context

Download source and context make it possible to analyze tracking data by how and where the file was requested. By default, the sources are:

  • download: a direct download via the website
  • feed: an automatic download via RSS feeds
  • webplayer: the episode was played using the player on the website

To further drill down, each source has multiple contexts:

  • download
    • select-button: a direct download by clicking a download button
    • select-show: the user obtained the URL by revealing it on the website
  • feed
    • the asset is saved as the context (m4a, mp3, ogg, etc.)
  • webplayer
    • episode: the player on the episode page
    • home: the player on the home page
    • website: the player on any other page

Geo Location

Location data is derived from the IP address using the GeoLite2 database by MaxMind and saved with the request.

Data Cleanup

Before tracking data is presented in the analytics area, it is cleaned up. Cleanup involves the following steps:

  • HEAD requests are ignored.
  • Requests for only the first byte are ignored.
  • Bots are filtered out, based on the User Agent analysis.
  • Duplicate requests are filtered out. A request is considered a duplicate if it contains
    • the same File ID
    • and the same Request ID
    • and was made within the same hour (can be changed to day in settings)
  • Pre-Release downloads are filtered out. They may happen if you test downloads before publishing the episode.
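The cleanup steps above can be sketched as a single filter function. This is a simplified illustration, not the Publisher's code: the Request class and the clean function are hypothetical, a "first byte" request is assumed to carry the Range header "bytes=0-0", and "within the same hour" is assumed to mean the same calendar hour (or calendar day for the daily window).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass
class Request:
    method: str                             # HTTP method, e.g. "GET" or "HEAD"
    file_id: int
    request_id: str
    accessed_at: datetime
    http_range: Optional[str] = None        # raw HTTP "Range" header, if any
    is_bot: bool = False                    # result of the user agent analysis
    released_at: Optional[datetime] = None  # episode release time, if known


def clean(requests: Iterable[Request], window: str = "hour") -> List[Request]:
    """Apply the cleanup steps listed above; window is "hour" or "day"."""
    bucket_format = "%Y-%m-%d %H" if window == "hour" else "%Y-%m-%d"
    seen = set()
    kept = []
    for r in sorted(requests, key=lambda req: req.accessed_at):
        if r.method.upper() == "HEAD":
            continue
        # Assumption: a request for only the first byte looks like "bytes=0-0".
        if r.http_range is not None and r.http_range.replace(" ", "") == "bytes=0-0":
            continue
        if r.is_bot:
            continue
        # Pre-release downloads: accessed before the episode was published.
        if r.released_at is not None and r.accessed_at < r.released_at:
            continue
        # Duplicate: same file, same request ID, same time bucket.
        key = (r.file_id, r.request_id, r.accessed_at.strftime(bucket_format))
        if key in seen:
            continue
        seen.add(key)
        kept.append(r)
    return kept


t = datetime(2024, 1, 1, 10, 5)
sample = [
    Request("GET", 646, "abc", t),
    Request("GET", 646, "abc", t.replace(minute=40)),       # duplicate, same hour
    Request("GET", 646, "abc", t.replace(hour=11)),         # next hour
    Request("HEAD", 646, "def", t),                         # HEAD request
    Request("GET", 646, "ghi", t, http_range="bytes=0-0"),  # first byte only
    Request("GET", 646, "jkl", t, is_bot=True),             # bot
]
print(len(clean(sample)))         # 2 with the hourly window
print(len(clean(sample, "day")))  # 1 with the daily window
```

With the hourly window, the request in the next hour counts as a separate download; with the daily window, all three requests sharing a request ID collapse into one.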

IAB Conformity

Advertisers often ask for download numbers according to the IAB Podcast Measurement Guidelines. The "Data Cleanup" section above explains how download numbers are treated, which follows the IAB guideline.

The only change you may need to make is to set the deduplication window to a day instead of an hour, which is the Podlove Publisher default. You can do this in Podlove > Expert Settings > Tracking.

Database Structure

If you would like to access the raw analytics data and process it yourself, you need to know which table holds which data and what it means.

wp_podlove_downloadintent

This table contains every tracked request, before any cleanup. Its columns are nearly identical to those of wp_podlove_downloadintentclean, which is the table you probably want to use instead.

wp_podlove_downloadintentclean

This table contains cleaned data from wp_podlove_downloadintent. See the section "Data Cleanup" for details on how data is cleaned and aggregated. Each row represents one download, so counting the rows in this table gives you the total number of downloads.

| Column | Description |
|---|---|
| id | Unique, auto-incrementing integer id |
| user_agent_id | reference to id in wp_podlove_useragent |
| media_file_id | reference to id in wp_podlove_mediafile |
| request_id | Artificial compound ID to identify identical requests anonymously. It is a hash from the combined IP address and user agent string. |
| accessed_at | Date and time of the download |
| source | One of "download", "feed", "webplayer". Specifies how the download was initiated. |
| context | A more specific description for the source column. |
| geo_area_id | reference to id in wp_podlove_geoarea |
| lat | Location: latitude |
| lng | Location: longitude |
| httprange | HTTP "Range" header of the request, which may specify which bytes exactly were requested. |
| hours_since_release | Amount of hours between episode release and download. May be useful for aggregation. |
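To illustrate counting rows, the sketch below builds a tiny in-memory stand-in for this table and counts downloads in total and per source. It uses Python's sqlite3 module only so the example is self-contained; a real WordPress installation stores this table in MySQL or MariaDB, the table prefix may differ from wp_, and only a subset of the columns is modeled here. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced stand-in for the real table; only some columns are modeled.
conn.execute(
    "CREATE TABLE wp_podlove_downloadintentclean ("
    "id INTEGER PRIMARY KEY, media_file_id INTEGER, request_id TEXT, "
    "accessed_at TEXT, source TEXT, context TEXT)"
)
rows = [
    (1, "r1", "2024-01-01 10:05:00", "feed", "m4a"),
    (1, "r2", "2024-01-01 11:20:00", "feed", "mp3"),
    (1, "r3", "2024-01-02 09:00:00", "webplayer", "episode"),
    (2, "r1", "2024-01-03 08:30:00", "download", "select-button"),
]
conn.executemany(
    "INSERT INTO wp_podlove_downloadintentclean "
    "(media_file_id, request_id, accessed_at, source, context) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)

# Each row is one download, so COUNT(*) is the total number of downloads.
total = conn.execute(
    "SELECT COUNT(*) FROM wp_podlove_downloadintentclean"
).fetchone()[0]
by_source = dict(conn.execute(
    "SELECT source, COUNT(*) FROM wp_podlove_downloadintentclean GROUP BY source"
).fetchall())
print(total)
print(by_source)
```

The same two SELECT statements should carry over to the real database largely unchanged, since they use only standard SQL.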

wp_podlove_mediafile

This table holds metadata for each file, like the file size. It holds references to the episode and asset.

| Column | Description |
|---|---|
| id | Unique, auto-incrementing integer id |
| episode_id | reference to id in wp_podlove_episode |
| episode_asset_id | reference to id in wp_podlove_episodeasset |
| size | file size in bytes. -1 or NULL if unknown. |
| etag | etag used in HTTP requests |

wp_podlove_episodeasset

This table holds metadata for assets. It is useful if you want to display asset names or need the reference to the file type.

| Column | Description |
|---|---|
| id | Unique, auto-incrementing integer id |
| title | Asset title |
| identifier | Asset identifier for use in templates |
| file_type_id | reference to id in wp_podlove_filetype |
| suffix | Used for constructing download URLs |
| downloadable | 1 if it appears in download dialogues, 0 otherwise. |
| position | integer used for ordering |

wp_podlove_filetype

This table holds metadata for the file type associated with an asset. It is useful if you need to display the file extension or restrict downloads to a given type, for example only those where type is "audio".

| Column | Description |
|---|---|
| id | Unique, auto-incrementing integer id |
| name | Filetype name |
| type | One of "audio", "video", "ebook", "image", "chapters", "metadata", "transcript" |
| mime_type | Mime Type |
| extension | File extension |
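The references between these tables can be followed with joins. The sketch below counts audio downloads per episode by going from wp_podlove_downloadintentclean through wp_podlove_mediafile and wp_podlove_episodeasset to wp_podlove_filetype. It is a self-contained illustration using Python's sqlite3 module with reduced, made-up tables; on a real installation the data lives in MySQL or MariaDB and the table prefix may differ from wp_.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced stand-ins for the real tables with made-up sample rows.
conn.executescript("""
CREATE TABLE wp_podlove_filetype (
    id INTEGER PRIMARY KEY, name TEXT, type TEXT, extension TEXT);
CREATE TABLE wp_podlove_episodeasset (
    id INTEGER PRIMARY KEY, title TEXT, file_type_id INTEGER);
CREATE TABLE wp_podlove_mediafile (
    id INTEGER PRIMARY KEY, episode_id INTEGER, episode_asset_id INTEGER);
CREATE TABLE wp_podlove_downloadintentclean (
    id INTEGER PRIMARY KEY, media_file_id INTEGER, source TEXT);

INSERT INTO wp_podlove_filetype VALUES
    (1, 'MPEG-4 AAC Audio', 'audio', 'm4a'), (2, 'PDF', 'ebook', 'pdf');
INSERT INTO wp_podlove_episodeasset VALUES
    (1, 'M4A Audio', 1), (2, 'Shownotes PDF', 2);
INSERT INTO wp_podlove_mediafile VALUES
    (10, 100, 1), (11, 100, 2), (12, 101, 1);
INSERT INTO wp_podlove_downloadintentclean (media_file_id, source) VALUES
    (10, 'feed'), (10, 'webplayer'), (11, 'download'), (12, 'feed');
""")

# Follow the references: download -> media file -> asset -> file type.
query = """
SELECT mf.episode_id, COUNT(*) AS downloads
FROM wp_podlove_downloadintentclean di
JOIN wp_podlove_mediafile mf ON mf.id = di.media_file_id
JOIN wp_podlove_episodeasset ea ON ea.id = mf.episode_asset_id
JOIN wp_podlove_filetype ft ON ft.id = ea.file_type_id
WHERE ft.type = 'audio'
GROUP BY mf.episode_id
ORDER BY mf.episode_id
"""
audio_downloads = conn.execute(query).fetchall()
print(audio_downloads)  # [(100, 2), (101, 1)]
```

In this sample, the PDF download for episode 100 is excluded because its file type is "ebook", so only the audio downloads are counted.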

wp_podlove_episode

This table holds metadata for the episode. It is complemented by data in the WordPress-native table wp_posts, which you will need, for example, to access the episode title.

| Column | Description |
|---|---|
| id | Unique, auto-incrementing integer id |
| post_id | reference to id in wp_posts |
| subtitle | Episode subtitle |
| summary | Episode summary / description |
...