Project Activity Reports

From KDE Community Wiki

Introduction

KDE Project Activity Reports is a web app that displays graphical reports about the activity of each of KDE's projects over time, as well as the activity of KDE overall. Activity includes the number of contributions, the number of active contributors and the names of the top contributors.
The app is built with Ruby on Rails. It was started in GSoC 2013, work on it has continued since then, and it was included again in GSoC 2014.
Currently, the app tracks bugs, git commits, mailing lists, IRC channels, social networks, forums, Planet and Dot. Some of the reports are live and others might be up to 12 hours old. For some of these sources you can also view a list of the latest activity of individual projects. More details can be found in the next section.

Graphical Reports

The app collects data from different sources and processes it to produce graphical reports about activity over time.

Line graphs have a "zoom" section on top that contains some buttons. Each button displays the numbers for a given time period, and you can zoom in until you see numbers per day. When you zoom out, the numbers are grouped per month or per year as appropriate for the zoom period you chose. You can also use the small control just under the graph to zoom in to a certain period.

Doer charts, such as Authors and Senders, have 3 tabs: daily, monthly and yearly. This is because numbers of doers per day cannot be grouped automatically into months or years. The default tab is daily, zoomed to 1 month. While you can zoom out to "all", that is not recommended for large projects because it produces a complex graph and makes the page slow. The monthly and yearly tabs are at "all" by default, so they are a better choice if you want to see the activity over the whole time period.
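
One plausible reading of the grouping limitation: per-day contribution counts can simply be summed into months, but per-day counts of distinct doers cannot, because the same person may be active on several days. The following is a minimal Python sketch with made-up data. The names, dates and the helper distinct_doers are illustrative assumptions, not taken from the app.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (day, author) pairs, one per commit.
commits = [
    (date(2014, 5, 1), "alice"),
    (date(2014, 5, 1), "bob"),
    (date(2014, 5, 2), "alice"),
    (date(2014, 5, 3), "alice"),
]


def distinct_doers(pairs, key):
    """Count distinct authors per bucket, where key(day) names the bucket."""
    buckets = defaultdict(set)
    for day, author in pairs:
        buckets[key(day)].add(author)
    return {bucket: len(authors) for bucket, authors in buckets.items()}


daily = distinct_doers(commits, key=lambda d: d)
monthly = distinct_doers(commits, key=lambda d: (d.year, d.month))

# Adding up the daily numbers overcounts people who were active on several days.
summed_daily = sum(daily.values())
```

Here the daily values add up to 4 while only 2 people were active in the month, which is why a separate monthly (and yearly) series has to be computed from the raw data.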

Bar charts are used to display the top contributors in the last time period. If the number of contributors is very large (more than 100), you only see the top 10 and "others". If the number of contributors is relatively large (more than 15 but at most 100), you see the top 10 by default, but a tab on top of the chart allows you to display all contributors. If the number of contributors is small (at most 15), you see all the contributors by default.
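
The thresholds above can be summarized as a small decision function. This is only a sketch: the function name and the returned labels are made up for illustration and are not the app's own identifiers.

```python
def bar_chart_mode(contributor_count: int) -> str:
    """Pick how a top-contributors bar chart is displayed (labels are hypothetical)."""
    if contributor_count > 100:
        # Very large: only the top 10 plus an aggregated "others" bar.
        return "top10_plus_others"
    if contributor_count > 15:
        # Relatively large: top 10 by default, with a tab to show everyone.
        return "top10_with_show_all_tab"
    # Small: everyone is shown by default.
    return "all"
```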

This section contains the details of each of the produced reports.

Bugs

KDE uses Bugzilla as its bug tracker. The app uses Bugzilla's API to collect data and produces the reports according to the data collected. The app receives e-mails sent to kde-bugs-dist to keep the report live all the time.

Note: Bugs created before September 16th 2002 are not included in the report due to problems in retrieving their history.

Below is a detailed description of each of the graphs and charts.

To see a live example: http://reports.kde.org/en/projects/kde-community/bugs_report

At the top there's a small well that shows the average time to close bugs for the given project. This is the sum of the time taken to close each closed bug, divided by the number of closed bugs.
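
As a sketch of that calculation in Python: the input shape (pairs of creation and closing times, with None for bugs that are still open) is an assumption made for this example, not the app's data model.

```python
from datetime import datetime, timedelta


def average_time_to_close(bugs):
    """bugs: iterable of (created_at, closed_at) pairs; closed_at is None if open."""
    durations = [closed - created for created, closed in bugs if closed is not None]
    if not durations:
        return None  # no closed bugs, so there is no average to show
    return sum(durations, timedelta()) / len(durations)
```

Open bugs are skipped entirely, so they affect neither the sum nor the divisor.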

Number of bugs per time period

This graph has 3 lines: the first represents the number of bugs reported on that day, the second the number of bugs resolved on that day, and the third the number of bugs that were opened on that day and are still open now. A reported bug is a bug whose creation time is on that day. A resolved bug is a bug that is closed and whose last-changed time is on that day.
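
The three lines can be sketched as three per-day counters. The Bug record below is a simplified stand-in invented for this example. The real data comes from Bugzilla and has many more fields.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class Bug:
    created: date        # day the bug was reported
    last_changed: date   # day of the last change
    is_open: bool        # True if the bug is still open now


def daily_counts(bugs):
    """Return (reported, resolved, still_open) counters keyed by day."""
    reported = Counter(b.created for b in bugs)
    # A resolved bug is counted on its last-changed day, not its creation day.
    resolved = Counter(b.last_changed for b in bugs if not b.is_open)
    # Still-open bugs are counted on the day they were opened.
    still_open = Counter(b.created for b in bugs if b.is_open)
    return reported, resolved, still_open
```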

Total number of bugs over time

This graph has 1 line that represents the total number of bugs that this project had by that point in time. The number for each day is the number of bugs whose creation time is on or before that day.
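
In other words, the line is a running total of the per-day creation counts. A minimal sketch, assuming that start is no later than the earliest creation day (the function and parameter names are illustrative):

```python
from collections import Counter
from datetime import date, timedelta
from itertools import accumulate


def total_bugs_over_time(creation_days, start, end):
    """Map each day in [start, end] to the number of bugs created on or before it."""
    per_day = Counter(creation_days)
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    # accumulate() turns per-day counts into a running total.
    totals = accumulate(per_day[d] for d in days)
    return dict(zip(days, totals))
```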

Total number of open bugs over time

This graph has 1 line that represents the total number of open bugs that this project had by that point in time. The number for each day is the number of bugs created on or before that day minus the number of closed bugs whose last-changed time is on or before that day.

Average number of days to close bugs closed in less than a year over time

This graph has 1 line that represents the average number of days taken to close bugs, but it leaves out any bug that was closed a year or more after its creation date. A plain average would always go from high in older years to low in recent years, because only older bugs have had the time to stay open for years, so the graph only includes bugs that were resolved in less than a year.

Number of bugs resolved after a year

This graph is meant to be used with the previous one. It has 1 line that represents the number of bugs that were closed a year or more after their creation date, so that you can combine it with the previous graph to get a better picture of performance.
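
The split between these two graphs can be sketched as follows for the bugs of a single time period. Treating "a year" as 365 days is an assumption of this example, as are the function name and input shape.

```python
from datetime import date, timedelta

ONE_YEAR = timedelta(days=365)  # assumption: "a year" is taken as 365 days


def split_by_close_time(closed_bugs):
    """closed_bugs: list of (created, closed) dates.

    Returns (average days to close for bugs closed in under a year,
             number of bugs closed after a year or more).
    """
    quick = [(closed - created).days
             for created, closed in closed_bugs
             if closed - created < ONE_YEAR]
    slow_count = sum(1 for created, closed in closed_bugs
                     if closed - created >= ONE_YEAR)
    average = sum(quick) / len(quick) if quick else None
    return average, slow_count
```

Every closed bug lands in exactly one of the two results, which is what lets the two graphs be read together.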

Bugs pie charts

Each of the pie charts shows the total number of bugs with a certain property. If the name of the pie chart starts with "open", it counts open bugs only. If the name starts with "Resolved", it counts resolved bugs only.

Commits

The app uses the git repositories to collect the data needed to produce the reports. Note that commits whose messages contain "SVN_SILENT" are excluded from all reports. The app receives e-mails sent to kde-commits and processes them to keep the commits report live for all projects except "KDE Community". That report takes some time to regenerate, so regenerating it on every new commit would be a waste of resources. Instead, it is regenerated every 12 hours. This interval might decrease soon. A well at the bottom of the report tells you when the report was last updated. Note that merging a branch into master does not trigger an e-mail, so there might be a delay of up to 24 hours before the commits made in the branch appear in the report.
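
The SVN_SILENT rule amounts to a simple filter on the commit message. The sketch below assumes a case-sensitive match anywhere in the message. The exact matching used by the app is not documented here, and the function name is made up.

```python
def is_counted(commit_message: str) -> bool:
    """Return False for commits that the reports leave out (assumed matching rule)."""
    return "SVN_SILENT" not in commit_message


def counted_commits(messages):
    """Keep only the commit messages that would be included in the reports."""
    return [m for m in messages if is_counted(m)]
```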

To see a live example: http://reports.kde.org/en/projects/kde-community/commits_report

Number of commits per time period

This graph has 1 line that represents the number of commits made on that day. The date of a commit is the date on which it was authored.

Authors per time period

This graph has 1 line that represents the number of different authors who have authored at least 1 commit in each time period.

Commits pie charts

Each of the pie charts shows the names of top authors in a given time period along with the total number of commits they've done in that period.

Mailing Lists

The app uses the mailing lists' mbox files to collect the data needed to produce the reports. The app is also subscribed (manually) to each of the mailing lists and processes their e-mails to keep the report live for all projects. Currently, the number of lists tracked is relatively small, so the "KDE Community" report is regenerated on every new message, but this might change once the number of lists tracked becomes large.

To see a live example: http://reports.kde.org/en/projects/kde-community/mailing_lists_report

Number of messages per time period

This graph has 1 line that represents the number of messages sent on each day. The date used is the one in the Date field of the message.
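
As an illustration of counting by the Date field, here is a small sketch using Python's standard email package. It takes raw message strings and buckets them by the calendar day written in the header, without converting time zones. That choice, like the function name, is an assumption of this example.

```python
from collections import Counter
from email import message_from_string
from email.utils import parsedate_to_datetime


def messages_per_day(raw_messages):
    """Count messages per day, using each message's Date header."""
    counts = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        sent = parsedate_to_datetime(msg["Date"])
        counts[sent.date()] += 1
    return counts
```

For a real archive, the messages could be read from an mbox file with the standard mailbox module instead of being passed in as strings.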

Average number of hours to first reply

This graph has 1 line that represents the average number of hours before a thread gets its first reply. Only thread starters and direct first replies to them are considered. Replies to replies are not considered.
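
A minimal sketch of this calculation, assuming each message is a dict with an id, the id of the message it replies to (None for a thread starter) and a date. These field names are invented for the example. Threads without any reply are left out of the average here, which is also an assumption.

```python
from datetime import datetime


def average_hours_to_first_reply(messages):
    """messages: list of dicts with 'id', 'in_reply_to' (or None) and 'date'."""
    # Thread starters are the messages that reply to nothing.
    roots = {m["id"]: m["date"] for m in messages if m["in_reply_to"] is None}
    first_reply = {}
    for m in messages:
        parent = m["in_reply_to"]
        if parent in roots:  # direct reply to a thread starter only
            if parent not in first_reply or m["date"] < first_reply[parent]:
                first_reply[parent] = m["date"]
    if not first_reply:
        return None
    hours = [(first_reply[r] - roots[r]).total_seconds() / 3600
             for r in first_reply]
    return sum(hours) / len(hours)
```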

Senders per time period

This graph has 1 line that represents the number of different senders who have sent at least 1 message in each time period.

Mailing Lists pie charts

Each of the pie charts shows the names of top senders in a given time period along with the total number of messages they've sent in that period.

IRC Channels

Once an IRC channel is added to a project, a bot is sent to that channel. The bot notifies the app whenever a new message is sent. It only sends the name of the sender. This report is live for all projects. As is the case with mailing lists, the number of currently tracked IRC channels is relatively small, so the "KDE Community" report is regenerated on every new message, but this might change once the number of channels tracked becomes large.

To see a live example: http://reports.kde.org/en/projects/kde-community/irc_channels_report

Number of messages per time period

This graph has 1 line that represents the number of messages sent on each day.

Average number of hours to first reply

This graph has 1 line that represents the average number of hours to first reply. Detecting whether a message starts a new conversation or is a reply is not easy, so this graph is not as accurate as the other average graphs.

The rules used to determine start of a conversation and reply

- Any message that starts with a nickname followed by a ":" is considered a reply and is never considered the start of a new conversation.

- Any message that does not start with a nickname followed by a ":" is considered the start of a new conversation, unless its author is the first replier in a conversation that started recently (currently defined as within 10 minutes). This exception applies only to the first replier. Other repliers are not tracked, so if they write a message without the reply format after replying, it is considered the start of a new conversation.

- A person who has already started a conversation can't start another one until the first conversation has been replied to and 10 minutes have passed since its beginning. (Any other message they write in this period is ignored.)

- A reply can only be made within a day (24 hours). If a person starts a conversation and posts a new message more than a day later, it is considered a new conversation. Until then, as long as their initial conversation has not been replied to, any new message they post is ignored.
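
The rules above can be sketched as a small classifier. This is one possible reading of them, not the app's actual code: the names Conversation, addressed_nick and classify are made up, only the latest conversation per person is remembered, and a reply is matched to the conversation of the nickname it addresses.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional

RECENT = timedelta(minutes=10)     # "recently" window from the rules
REPLY_LIMIT = timedelta(hours=24)  # a reply must arrive within a day


@dataclass
class Conversation:
    started_at: datetime
    first_replier: Optional[str] = None


def addressed_nick(text: str, nicks: set) -> Optional[str]:
    """Return the nickname a message is addressed to ("nick: ..."), if any."""
    head, sep, _ = text.partition(":")
    return head if sep and head in nicks else None


def classify(messages, nicks):
    """Label each (time, sender, text) as 'reply', 'start' or 'ignored'.

    Also returns the delay between each conversation start and its first reply.
    """
    open_convs: Dict[str, Conversation] = {}  # latest conversation per starter
    labels: List[str] = []
    delays: List[timedelta] = []
    for when, sender, text in messages:
        target = addressed_nick(text, nicks)
        if target is not None:
            # "nick: ..." is always a reply, never a start.
            labels.append("reply")
            conv = open_convs.get(target)
            if (conv and conv.first_replier is None
                    and when - conv.started_at < REPLY_LIMIT):
                conv.first_replier = sender
                delays.append(when - conv.started_at)
            continue
        # The first replier of a recent conversation is still replying.
        if any(c.first_replier == sender and when - c.started_at < RECENT
               for c in open_convs.values()):
            labels.append("reply")
            continue
        conv = open_convs.get(sender)
        if conv is not None:
            # Wait 10 minutes after a replied-to conversation,
            # or 24 hours after an unanswered one.
            wait = RECENT if conv.first_replier else REPLY_LIMIT
            if when - conv.started_at < wait:
                labels.append("ignored")
                continue
        open_convs[sender] = Conversation(started_at=when)
        labels.append("start")
    return labels, delays
```

The average number of hours to first reply would then be the mean of the returned delays.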

Senders per time period

This graph has 1 line that represents the number of different senders who have sent at least 1 message in each time period.

IRC channels pie charts

Each of the pie charts shows the names of top senders in a given time period along with the total number of messages they've sent in that period.

Facebook Report

The Facebook report is based on posts shared by the page. Notes and Albums are not considered due to the limitations of the API.

The "KDE Community" project shows the Facebook report for the "kde" page only. Unlike the other reports, the social network reports for "KDE Community" are therefore not the sum of the numbers of all other projects.

You can see the report here: http://reports.kde.org/en/projects/kde-community/facebook_pages_report

The first graph shows the number of posts and the interactions for posts made in that period. This means that the numbers shown relate to the posts themselves, not to the time the interactions happened. For example, if a user commented on a post a day after it was posted, the comment is counted on the day of the post itself. The second graph, in contrast, shows the number of comments made in a specific period based on the date of the comment itself. The third graph shows the number of doers (likers and commenters) in a specific period. Commenters are counted by the date of their comment, while likers are counted by the date the post was written, as there's no way to find out when a like was made.
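
The difference between the first two graphs can be sketched like this. The post structure below (a posting day, a like count and a list of comment days) is invented for the example, and counting only likes and comments as interactions is an assumption.

```python
from collections import Counter
from datetime import date

# Hypothetical data: one post with 5 likes and two comments,
# the second comment written a day after the post.
posts = [
    {"posted": date(2014, 6, 1), "likes": 5,
     "comments": [date(2014, 6, 1), date(2014, 6, 2)]},
]


def interactions_by_post_date(posts):
    """First graph: every interaction is counted on the day of its post."""
    counts = Counter()
    for post in posts:
        counts[post["posted"]] += post["likes"] + len(post["comments"])
    return counts


def comments_by_comment_date(posts):
    """Second graph: each comment is counted on the day it was written."""
    return Counter(day for post in posts for day in post["comments"])
```

With this data, the first function attributes all 7 interactions to June 1st, while the second spreads the two comments over June 1st and June 2nd.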

Twitter Report

The Twitter API provides no way to access more than 100 retweeters of a specific tweet. The kdecommunity page has only 3 tweets that got more than 100 retweets, and 2 of them were actually retweets of tweets by Qt. This means that the retweet numbers you see are not fully accurate, as some retweeters were not counted. Since only one tweet by kdecommunity itself has more than 100 retweets, the numbers are not far off. Activity from now on is recorded through the Twitter stream API, so future tweets shouldn't have this problem.

The kdecommunity page seems to do lots of retweets. For a retweet made by the page, the Twitter API returns the total number of retweets of the original tweet as the number of retweets, and always returns 0 as the number of favorites. In the report I show the number of retweets as returned by the API (the total number of retweets) and the total number of favorites that the post got. Retweets made by the page are shown in a separate graph. However, retweeters of a retweeted tweet are counted neither in the doers line graph nor in the retweeters pie charts.

The first graph is the tweets graph, along with the user interactions for these tweets. It is similar to Facebook's first graph. The second graph shows statistics for the retweets made by the page. The third is similar to Facebook's comments graph and shows the number of retweets of KDE's tweets over time. The fourth shows retweeters over time and is similar to the commenters line graph in Facebook.

You can see the report here: http://reports.kde.org/en/projects/kde-community/twitter_report

Google+ Report

The Google+ API was actually one of the best. It provides all the data needed to generate reports about all user interactions. The graphs shown in this report are similar to Facebook's and Twitter's graphs. The first one shows the posts made by the page and the total number of interactions that posts in that period got. The second one shows replies by the dates they were made. The third one shows the number of unique doers in each period. Note that the resharers count is not correct, due to a problem in the Google+ API. An issue has been reported about it but there has been no response yet.

The Google+ API does not provide any way to subscribe to changes as of now, so the report (or at least its last few posts) needs to be regenerated periodically.

You can see the report here: http://reports.kde.org/en/projects/kde-community/google_plus_report

Forums Report

The forum report starts in 2014 because it's hard to get posts older than that. The report for the "KDE Community" project truly covers all KDE projects, since it's easy to get data for all forums.

You can see the report here: https://reports.kde.org/en/projects/kde-community/forums_report

Numbers per time period

This graph has 3 lines: the number of threads, the number of posts and the number of resolved threads per time period. Note that "resolved threads" means the number of threads that were posted on that day and are now in the resolved state. It does not refer to the number of threads that got resolved on that day.

Average number of hours to first reply over time

This graph has 1 line that shows the average number of hours to first reply. Note that this graph starts from 2015, because it was added in that year and the posts of 2014 couldn't all be fetched.

Posters per time period

This graph has 1 line that shows the number of unique posters in each time period.

Forums pie charts

Each of the pie charts shows the names of the top posters in a given time period along with the total number of posts they've made in that period.

Planet Report

This report exists only for the "KDE Community" project. The report starts from August 14th 2014. The line graphs show the number of posts per time period and the number of unique posters per time period. The pie charts show the top posters in each time period. The report is updated every hour.

You can see the report here: https://reports.kde.org/en/projects/kde-community/planet_report

Dot Report

This report exists only for the "KDE Community" project. It is updated every 12 hours.

You can see the report here: https://reports.kde.org/en/projects/kde-community/dot_report