Download Analytics

The Podlove Publisher tracks download intents made by clients. It is only tracked that a download was started but not if it was completed. For brevity, this document will speak of “downloads”. Just be aware that what is tracked are actually only download intents. So when you are looking at your data, be aware that the numbers displayed do not represent the actual number of listening users.

Anatomy of Tracking URLs

The Publisher creates “tracking URLs”. For example, if your media file is this:

media.example.com/podcast/pod001.m4a

then a tracking URL might look like that:

example.com/podlove/file/646/s/feed/c/m4a/pod001.m4a

Requests on tracking URLs are intercepted by the Publisher, analyzed, and finally redirected to the actual physical file. On to a closer look at the URL components.

example.com/podlove/file/646/s/feed/c/m4a/pod001.m4a

This is your blog domain and a “podlove” URL prefix so tracking URLs don’t interfere with your blogs pages.

example.com/podlove/file/646/s/feed/c/m4a/pod001.m4a

This identifies the actual file to download.

example.com/podlove/file/646/s/feed/c/m4a/pod001.m4a

These are indicators for the source (s) and context (c) of the Download. This allows you to have separate analytics for downloads from feeds, the web player, etc.

example.com/podlove/file/646/s/feed/c/m4a/pod001.m4a

What looks like the file name is purely of decorative nature. It makes the URL easier to read and some command line clients will use that part of the URL to auto-generate a filename. But it is irrelevant for the purpose of tracking.

Tracking Data

Only real download requests are tracked. In more technical terms: HEAD requests are ignored. These are the analyzed and saved values:

UA (User Agent)

Based on the UA, facts about the client can be derived or guessed; such as:

client name (Chrome, Firefox, iTunes, Instacast, …)
operating system (Android, iOS, Mac, …)
device (iPhone, Galaxy Nexus, …)

Furthermore, bots can be detected.

File ID

A reference to the downloaded file is kept. This allows to group tracking data by episode.

Request ID

The request ID is an artificial compound ID to identify identical requests anonymously. It is a hash from the combined IP address and user agent string.

Source & Context

Download source & context allows to analyze tracking data by how and where the file was requested. By default, the sources are:

download: a direct download via the website
feed: an automatic download via RSS feeds
webplayer: the episode was played using the player on the website

To further drill down, each source has multiple contexts:

download
- select-button: a direct download by clicking a download button
- select-show: the user obtained the URL by revealing it on the website
for feed, the asset is saved (m4a, mp3, ogg, etc.)
webplayer
- episode: the player on the episode page
- home: the player on the home page
- website: the player on any other page

Geo Location

Based on the IP address and the GeoLite2 Database by MaxMind location data is saved.

Data Cleanup

Before tracking data is presented in the analytics area, it is cleaned up. Cleanup involves the following steps:

HEAD requests are ignored
requests for the first byte are ignored
Based on the User Agent analysis bots are filtered out.
Duplicate requests are filtered out. A request is considered a duplicate if it contains
- the same File ID
- and the same Request ID
- and was made within the same hour (can be changed to day in settings)
Pre-Release downloads are filtered out. They may happen if you test downloads before publishing the episode.

IAB Conformity

Advertisers often ask for downloads numbers according to IAB Podcast Measurement Guidelines. The "data cleanup" section above explains how download numbers are treated, which is in fact according to the IAB guideline.

The only change you may need to do is set the deduplication window to a day instead of an hour, the Podlove Publisher default. You can do this in Podlove > Expert Settings > Tracking.

Database Structure

If you would like to access the raw analytics data and process it yourself, you need to know where what data is and what it means.

wp_podlove_downloadintent

You probably want to use wp_podlove_downloadintentclean instead. This table contains every tracked request. Columns are nearly identical to wp_podlove_downloadintentclean.

wp_podlove_downloadintentclean

This table contains cleaned data from wp_podlove_downloadintent. See section "Data Cleanup" for detail on how data is cleaned/aggregated. Each row represents one download. If you count the rows in this table you have the total number of downloads.

Column	Description
id	Unique, auto-incrementing integer id
user_agent_id	reference to id in `wp_podlove_useragent`
media_file_id	reference to id in `wp_podlove_mediafile`
request_id	Artificial compound ID to identify identical requests anonymously. It is a hash from the combined IP address and user agent string.
accessed_at	Date and time of the download
source	One of "download", "feed", "webplayer". Specifies how the download was initiated.
context	A more specific description for the source column.
geo_area_id	reference to id in `wp_podlove_geoarea`
lat	Location: latitude
lng	Location: longitude
httprange	HTTP "Range" header of the request, which may specify which bytes exactly were requested.
hours_since_release	Amount of hours between episode release and download. May be useful for aggregation.

wp_podlove_mediafile

This table holds metadata for each file, like the file size. It holds references to the episode and asset.

Column	Description
id	Unique, auto-incrementing integer id
episode_id	reference to id in `wp_podlove_episode`
episode_asset_id	reference to id in `wp_podlove_episodeasset`
size	file size in bytes. `-1` or `NULL` if unknown.
etag	etag used in HTTP requests

wp_podlove_episodeasset

This table holds metadata for assets. It is useful if you want to display asset names or need the reference to the file type.

Column	Description
id	Unique, auto-incrementing integer id
title	Asset title
identifier	Asset identifier for use in templates
file_type_id	reference to id in `wp_podlove_filetype`
suffix	Used for constructing download URLs
downloadable	`1` if it appears in download dialogues, `0` otherwise.
position	integer used for ordering

wp_podlove_filetype

This table holds metadata for the file type associated to an asset. It is useful if you need to display the file extension or filter all downloads by type == audio.

Column	Description
id	Unique, auto-incrementing integer id
name	Filetype name
type	One of "audio", "video", "ebook", "image", "chapters", "metadata", "transcript"
mime_type	Mime Type
extension	File extension

wp_podlove_episode

This table holds metadata for the episode. It is complemented by data in the WordPress-native table wp_posts, which you will need to access the episode title for example.

Column	Description
id	Unique, auto-incrementing integer id
post_id	reference to id in `wp_posts`
subtitle	Episode subtitle
summary	Episode summary / description
...

Anatomy of Tracking URLs​

Tracking Data​

UA (User Agent)​

File ID​

Request ID​

Source & Context​

Geo Location​

Data Cleanup​

IAB Conformity​

Database Structure​

wp_podlove_downloadintent​

wp_podlove_downloadintentclean​

wp_podlove_mediafile​

wp_podlove_episodeasset​

wp_podlove_filetype​

wp_podlove_episode​