Project Activity Reports

Under Construction
This is a new page, currently under construction!


Introduction

KDE Project Activity Reports is a web app that displays graphical reports about the activity of each KDE project over time, as well as the activity of KDE overall. Activity includes the number of contributions, the number of active contributors, and the names of the top contributors.
The app is built with Ruby on Rails. It was started during GSoC 2013 and has been in development since then. It was also part of GSoC 2014.
Currently, the app tracks bugs, git commits, mailing lists, IRC channels, social networks, forums, Planet and Dot. Some of the reports are live, while others might be up to 12 hours old. For some of these sources you can also view a list of the latest activity of individual projects. More details can be found in the sections below.

Graphical Reports

The app collects data from different sources and processes it to produce graphical reports about the activity over time.

Line graphs have a "zoom" section on top that contains several buttons. Each button displays the numbers for a given time period, and you can zoom in until you see numbers per day. When you zoom out, the numbers are grouped per month or per year, as appropriate for the zoom period you chose. You can also use the small control just under the graph to zoom in to a specific period.

For doers charts (Authors, Senders, etc.), you see 3 tabs: daily, monthly and yearly. This is because numbers of doers per day cannot be automatically grouped into months or years. The daily tab is the default, and it is zoomed to 1 month by default. While you can zoom out to "all", that is not recommended for large projects, as it produces a complex graph and makes the page slow. The monthly and yearly tabs are at "all" by default, so they are a better choice if you want to see the activity over the whole time period.
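As a rough illustration of the grouping that happens when you zoom out, the sketch below sums per-day counts into per-month or per-year buckets. It is not the app's actual code (the app is written in Ruby on Rails); the function name `group_counts` and the sample numbers are made up for this example.

```python
from collections import Counter
from datetime import date


def group_counts(daily_counts, period):
    """Sum per-day counts into per-day, per-month or per-year buckets."""
    key_formats = {"day": "%Y-%m-%d", "month": "%Y-%m", "year": "%Y"}
    fmt = key_formats[period]
    grouped = Counter()
    for day, count in daily_counts.items():
        grouped[day.strftime(fmt)] += count
    return dict(grouped)


# Hypothetical sample data: contributions per day.
daily = {date(2014, 1, 30): 4, date(2014, 1, 31): 1, date(2014, 2, 1): 7}
print(group_counts(daily, "month"))  # {'2014-01': 5, '2014-02': 7}
```

Plain counts such as commits or messages can be summed this way. Counts of distinct people cannot, because the same person may be active on several days of the same month.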

Bar charts are used to display the top contributors in the last time period. If the number of contributors is very large (more than 100), you only see the top 10 and "others". If the number of contributors is relatively large (more than 15 but at most 100), you see the top 10 by default, but a tab on top of the chart allows you to display all contributors. If the number of contributors is small (at most 15), you see all the contributors by default.
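The thresholds above can be sketched as follows. This is only an illustration, not the app's code: the function name `bar_chart_rows` and the `show_all` flag (standing in for the tab) are hypothetical, and it is an assumption that the "others" bar is the sum of the remaining contributors' counts.

```python
def bar_chart_rows(contributions, show_all=False):
    """contributions: dict of contributor name -> number of contributions.

    Returns the (label, count) bars to display, largest first.
    """
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    total = len(ranked)
    if total <= 15:
        return ranked  # small: everyone is shown by default
    if total <= 100:
        return ranked if show_all else ranked[:10]  # a tab switches to "all"
    # Very large: top 10 plus a single "others" bar (assumed to be a sum).
    others = sum(count for _, count in ranked[10:])
    return ranked[:10] + [("others", others)]
```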

This section contains the details of each of the produced reports.

Bugs

KDE uses Bugzilla as its bug tracker. The app uses Bugzilla's API to collect data and produces the reports from that data. To keep the report live at all times, the app also receives the e-mails sent to kde-bugs-dist. Below is a detailed description of each of the graphs and charts.

To see a live example: http://reports.kde.org/en/projects/kde-community/bugs_report

At the top there's a small well that shows the average time to close bugs for the given project. This is the total time taken to close all closed bugs divided by the number of closed bugs.
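A minimal sketch of that average is shown below. The data layout (pairs of creation and closing time) and the name `average_time_to_close` are assumptions made for the example. For closed bugs, the closing time would be the last changed time.

```python
from datetime import datetime, timedelta


def average_time_to_close(bugs):
    """bugs: iterable of (created_at, closed_at); closed_at is None for open bugs."""
    durations = [closed - created for created, closed in bugs if closed is not None]
    if not durations:
        return None  # no closed bugs, so there is nothing to average
    return sum(durations, timedelta()) / len(durations)


# Hypothetical sample: two closed bugs (2 and 4 days) and one open bug.
sample = [
    (datetime(2014, 1, 1), datetime(2014, 1, 3)),
    (datetime(2014, 1, 1), datetime(2014, 1, 5)),
    (datetime(2014, 1, 2), None),
]
print(average_time_to_close(sample))  # 3 days, 0:00:00
```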

Number of bugs per time period

This graph has 3 lines: the first represents the number of bugs reported on that day, the second represents the number of bugs resolved on that day, and the third represents the number of bugs that were opened on that day and are still open now. A reported bug is a bug whose creation time is on that day. A resolved bug is a bug that is closed and whose last changed time is on that day.
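The three lines can be sketched with the definitions above. The dictionary keys `created`, `last_changed` and `closed` are hypothetical field names chosen for this example, not Bugzilla's.

```python
from collections import Counter
from datetime import date


def bugs_per_day(bugs):
    """bugs: dicts with 'created' (date), 'last_changed' (date) and 'closed' (bool)."""
    reported, resolved, still_open = Counter(), Counter(), Counter()
    for bug in bugs:
        reported[bug["created"]] += 1  # creation time is on that day
        if bug["closed"]:
            resolved[bug["last_changed"]] += 1  # closed, last changed on that day
        else:
            still_open[bug["created"]] += 1  # opened that day and still open now
    return reported, resolved, still_open
```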

Total number of bugs over time

This graph has 1 line that represents the total number of bugs that the project had at that point in time. The number for each day is the number of bugs whose creation time is on or before that day.
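This running total can be sketched as below. The function name is made up for the example, and only days on which at least one bug was created get a data point here, which may differ from how the app fills the graph.

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def cumulative_totals(creation_dates):
    """Return day -> number of bugs created on or before that day."""
    per_day = Counter(creation_dates)
    days = sorted(per_day)
    running = accumulate(per_day[day] for day in days)
    return dict(zip(days, running))
```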

Total number of open bugs over time

This graph has 1 line that represents the total number of open bugs that the project had at that point in time. The number for each day is the number of bugs whose creation time is on or before that day, minus the number of closed bugs whose last changed time is on or before that day.

Average number of days to close bugs closed in less than a year over time

This graph has 1 line that represents the average number of days taken to close bugs, but it leaves out any bug that was closed more than 1 year after its creation date. A plain average would always go from high in old years to low in recent years, since only old bugs have had the chance to stay open for years. The graph therefore includes only bugs that were resolved in less than a year.

Number of bugs resolved after a year

This graph is meant to be used with the previous one. It has 1 line that represents the number of bugs that were closed 1 year or more after their creation date, so that you can combine it with the previous graph to get a better picture of the performance.
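The split between the last two graphs can be sketched for a single time bucket as follows. Treating a year as 365 days is an assumption, as are the function name and the data layout.

```python
from datetime import date, timedelta

ONE_YEAR = timedelta(days=365)  # assumption: a year is counted as 365 days


def split_by_close_time(closed_bugs):
    """closed_bugs: (created, closed) date pairs.

    Returns (average days to close for bugs closed in less than a year,
             number of bugs closed after a year or more).
    """
    quick = [(closed - created).days
             for created, closed in closed_bugs
             if closed - created < ONE_YEAR]
    slow_count = sum(1 for created, closed in closed_bugs
                     if closed - created >= ONE_YEAR)
    average = sum(quick) / len(quick) if quick else None
    return average, slow_count
```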

Bugs pie charts

Each of the pie charts shows the total number of bugs with a certain property. If the name of the pie chart starts with "open", it counts open bugs only. If the name starts with "Resolved", it counts resolved bugs only.

Commits

The app uses the git repositories to collect the data needed to produce the reports. Note that commits whose messages contain "SVN_SILENT" are excluded from all reports. The app receives the e-mails sent to kde-commits and processes them to keep the commits report live for all projects except "KDE Community". That report takes some time to regenerate, so regenerating it on every new commit would be a waste of resources. Instead, it is regenerated every 12 hours. This interval might decrease soon. A well at the bottom of the report tells you when the report was last updated. Note that merging a branch into master does not trigger an e-mail, so there might be a delay of up to 24 hours before the commits made in the branch appear in the report.
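The "SVN_SILENT" rule amounts to a simple filter. The sketch below assumes each commit is a dictionary with a `message` key, which is a layout chosen for this example only.

```python
def countable_commits(commits):
    """Drop commits whose message contains the SVN_SILENT marker."""
    return [commit for commit in commits if "SVN_SILENT" not in commit["message"]]


# Hypothetical sample commits.
commits = [
    {"author": "alice", "message": "Fix crash on startup"},
    {"author": "scripty", "message": "SVN_SILENT made messages (.desktop file)"},
]
print(len(countable_commits(commits)))  # 1
```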

To see a live example: http://reports.kde.org/en/projects/kde-community/commits_report

Number of commits per time period

This graph has 1 line that represents the number of commits done on that day. The date of the commit is the date when it was authored.

Authors per time period

This graph has 1 line that represents the number of different authors who have authored at least 1 commit in each time period.
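Counting distinct authors per period can be sketched as below. The function name and sample data are hypothetical. The example also shows why doers charts have separate daily, monthly and yearly tabs: daily author counts cannot simply be summed into a monthly count.

```python
from datetime import date


def authors_per_period(commits, period_format):
    """commits: (author, authored_date) pairs. Returns period -> distinct authors."""
    authors = {}
    for author, authored in commits:
        authors.setdefault(authored.strftime(period_format), set()).add(author)
    return {period: len(names) for period, names in authors.items()}


commits = [
    ("alice", date(2014, 3, 1)),
    ("alice", date(2014, 3, 2)),
    ("bob", date(2014, 3, 2)),
]
daily = authors_per_period(commits, "%Y-%m-%d")  # {'2014-03-01': 1, '2014-03-02': 2}
monthly = authors_per_period(commits, "%Y-%m")   # {'2014-03': 2}
```

The daily values add up to 3, while only 2 different authors were active in the month, so each period has to be computed from the raw data.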

Commits pie charts

Each of the pie charts shows the names of top authors in a given time period along with the total number of commits they've done in that period.

Mailing Lists

The app uses the mbox files of the mailing lists to collect the data needed to produce the reports. The app is also subscribed (manually) to each of the mailing lists and processes their e-mails to keep the report live for all projects. Currently, the number of tracked lists is relatively small, so the "KDE Community" report is regenerated on every new message, but this might change once the number of tracked lists becomes large.

To see a live example: http://reports.kde.org/en/projects/kde-community/mailing_lists_report

Number of messages per time period

This graph has 1 line that represents the number of messages sent on each day. The date used is the one in the Date field of the message.
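As an illustration of counting messages by their Date field, the sketch below uses Python's standard `email` package. It is not the app's implementation; skipping messages without a Date field is an assumption, and the day is taken in the sender's own time zone as written in the header.

```python
from collections import Counter
from datetime import date
from email import message_from_string
from email.utils import parsedate_to_datetime


def messages_per_day(raw_messages):
    """raw_messages: iterable of raw e-mail texts. Returns day -> message count."""
    counts = Counter()
    for raw in raw_messages:
        header = message_from_string(raw)["Date"]
        if header is None:
            continue  # assumption: messages without a Date field are skipped
        counts[parsedate_to_datetime(header).date()] += 1
    return counts


# Hypothetical sample message.
raw = (
    "From: someone@example.org\n"
    "Date: Mon, 03 Mar 2014 10:00:00 +0000\n"
    "Subject: Hello\n"
    "\n"
    "Body text\n"
)
print(messages_per_day([raw, raw])[date(2014, 3, 3)])  # 2
```

For a whole archive, the standard `mailbox.mbox` class can iterate over the messages of an mbox file.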

Average number of hours to first reply

This graph has 1 line that represents the average number of hours before a thread gets its first reply. Only the messages that start a thread and the first reply to each of them are considered. Replies to replies are not considered.
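A minimal sketch of this average is shown below. It assumes that threads are linked through message identifiers (as with the Message-ID and In-Reply-To e-mail headers); the dictionary keys `id`, `in_reply_to` and `sent` are names chosen for the example, and the app may detect threads differently.

```python
from datetime import datetime


def average_hours_to_first_reply(messages):
    """messages: dicts with 'id', 'in_reply_to' (None for a thread start), 'sent'."""
    starts = {m["id"]: m["sent"] for m in messages if m["in_reply_to"] is None}
    first_reply = {}
    for m in messages:
        parent = m["in_reply_to"]
        if parent in starts:  # direct reply to a thread start; replies to replies are skipped
            if parent not in first_reply or m["sent"] < first_reply[parent]:
                first_reply[parent] = m["sent"]
    if not first_reply:
        return None  # no thread has been answered yet
    hours = [(first_reply[p] - starts[p]).total_seconds() / 3600 for p in first_reply]
    return sum(hours) / len(hours)
```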

Senders per time period

This graph has 1 line that represents the number of different senders who have sent at least 1 message in each time period.

Mailing Lists pie charts

Each of the pie charts shows the names of top senders in a given time period along with the total number of messages they've sent in that period.

IRC Channels

Once an IRC channel is added to a project, a bot is sent to that channel. The bot notifies the app whenever a new message is sent. It only sends the name of the sender. This report is live for all projects. As is the case with mailing lists, the number of currently tracked IRC channels is relatively small, so the "KDE Community" report is regenerated on every new message, but this might change once the number of tracked channels becomes large.

To see a live example: http://reports.kde.org/en/projects/kde-community/irc_channels_report

Number of messages per time period

This graph has 1 line that represents the number of messages sent on each day.

Average number of hours to first reply

This graph has 1 line that represents the average number of hours to first reply. Detecting whether a message starts a new conversation or is a reply is not easy, so this report is not as accurate as the other average graphs.

The rules used to determine start of a conversation and reply

- Any message that starts with a nickname followed by a ":" is considered a reply and is never considered the start of a new conversation.
- Any message that does not start with a nickname followed by a ":" is considered the start of a new conversation, unless the person who wrote it is the first replier in a conversation that started recently (currently defined as within 10 minutes). Note that this applies only to the first replier. Other repliers are not treated this way, so if they write a message after replying without the format of a reply, it is considered the start of a new conversation.
- A person who has already started a conversation can't start another one unless the first conversation has been replied to and 10 minutes have passed since its beginning. (If they write another message in this period, it is ignored.)
- A reply can only be made in less than a day (24 hours). If a person starts a conversation and posts a new message after a day, it is considered a new conversation. However, until the 24 hours have passed, and as long as their initial conversation has not been replied to, any new message they post is ignored.
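The rules above can be sketched as a small classifier. This is one possible reading of them, not the app's code. The names `Conversation`, `addressed_nick` and `classify` are made up, a reply is assumed to belong to the addressed person's most recent conversation, and messages that neither start a conversation nor are addressed to someone are labelled "ignored".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

RECENT = timedelta(minutes=10)     # "started recently" window
REPLY_LIMIT = timedelta(hours=24)  # a reply must arrive in less than a day


@dataclass
class Conversation:
    started_at: datetime
    first_replier: Optional[str] = None
    replied_at: Optional[datetime] = None


def addressed_nick(text, nicks):
    """Return the nickname a "nick: ..." message is addressed to, else None."""
    head, colon, _ = text.partition(":")
    return head if colon and head in nicks else None


def classify(messages, nicks):
    """messages: (time, sender, text) tuples in chronological order.

    Returns a label per message ("start", "reply" or "ignored") and the
    conversations that were started.
    """
    latest = {}  # starter nick -> that person's most recent conversation
    labels, conversations = [], []
    for when, sender, text in messages:
        target = addressed_nick(text, nicks)
        if target is not None:
            # Rule 1: addressed messages are replies, never conversation starts.
            labels.append("reply")
            conv = latest.get(target)
            if (conv is not None and conv.replied_at is None
                    and target != sender
                    and when - conv.started_at < REPLY_LIMIT):
                conv.first_replier, conv.replied_at = sender, when
            continue
        # Rule 2: the first replier of a recent conversation is still replying.
        if any(c.first_replier == sender and when - c.started_at <= RECENT
               for c in latest.values()):
            labels.append("ignored")
            continue
        own = latest.get(sender)
        if own is not None:
            # Rules 3 and 4: wait 10 minutes after an answered conversation,
            # or 24 hours after an unanswered one, before starting another.
            wait = RECENT if own.replied_at is not None else REPLY_LIMIT
            if when - own.started_at < wait:
                labels.append("ignored")
                continue
        conv = Conversation(started_at=when)
        latest[sender] = conv
        conversations.append(conv)
        labels.append("start")
    return labels, conversations


def average_hours_to_first_reply(conversations):
    """Average delay, in hours, over the conversations that got a reply."""
    delays = [(c.replied_at - c.started_at).total_seconds() / 3600
              for c in conversations if c.replied_at is not None]
    return sum(delays) / len(delays) if delays else None
```

Passing the set of nicknames currently in the channel as `nicks` keeps ordinary text that happens to contain a ":" from being mistaken for a reply.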


Facebook Report

It seems that Facebook has more than one type of post that a page can make. The most common type is the normal post, and then there are notes like this one: https://www.facebook.com/notes/kde/plasma-workspaces-wallpaper-contest/10150415266129281

Notice how the URL says that it's a note. The notes API is disabled in v2.0 and higher of the Facebook API. Luckily, it's still available in v1.0, which can be used until April 30th, 2015. That is enough to gather information about previous notes. It appears that KDE stopped writing notes after the one linked above (around mid November 2011), so we shouldn't need to use that part of the API again after the data is collected for the first time. The problem is that, even with API v1.0, I can't get the number of likes or shares for those notes. The report I'm generating treats notes as normal posts but doesn't include shared albums (i.e. I loop through the notes and posts APIs only while generating data).

Also, there are 2 types of tokens: one for normal users and one for apps. The one I use to generate the report is an app token. I've noticed some minor differences between them. For instance, have a look at this page: https://www.facebook.com/kde/posts/10150508291223918

This appears in the posts API if I use the app token, while it does not seem to appear if I use a user token. It is not actually a post. It's a comment, but one that was made by the KDE page itself. The effect on the generated report is that the number of posts made in January 2011 is increased by 1. I'm assuming these are minor bugs that have no effect on the usefulness of the report. If you prefer, I can use a user token, which would solve this.

The "KDE Community" project shows the Facebook report for the "kde" page only. So, unlike the other reports, the social network reports of "KDE Community" are not the sum of the numbers of all other projects.

You can see the report here: http://reports.kde.org/en/projects/kde-community/facebook_pages_report

The first graph shows the number of posts and the interactions on the posts made in that period. This means that the numbers are tied to the posts themselves, not to the time the interactions happened. For example, if a user commented on a post a day after it was posted, the comment is counted on the day of the post itself. This brings us to the second graph, which shows the number of comments made in a specific period based on the date of the comment itself. The third graph shows the number of doers (likers and commenters) in a specific period. Commenters are counted on the dates of their comments, while likers are counted on the date the post was written, as there's no way to find out when a like happened.
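The difference between the first two graphs can be sketched as below. The post layout (`posted`, `likes`, `comments`) is a simplified, hypothetical structure for this example and not the shape of the Facebook API response.

```python
from collections import Counter
from datetime import date


def interaction_graphs(posts):
    """posts: dicts with 'posted' (date), 'likes' (int), 'comments' (list of dates)."""
    by_post_date = Counter()     # first graph: interactions credited to the post's day
    by_comment_date = Counter()  # second graph: comments on the day they were written
    for post in posts:
        by_post_date[post["posted"]] += post["likes"] + len(post["comments"])
        for commented in post["comments"]:
            by_comment_date[commented] += 1
    return by_post_date, by_comment_date


# A post from January 1st with 3 likes, one comment that day and one a day later.
posts = [{"posted": date(2014, 1, 1), "likes": 3,
          "comments": [date(2014, 1, 1), date(2014, 1, 2)]}]
by_post, by_comment = interaction_graphs(posts)
print(by_post[date(2014, 1, 1)])     # 5
print(by_comment[date(2014, 1, 2)])  # 1
```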

Under that, you can see the pie charts. There is a tab for commenters and a tab for likers. The Facebook API doesn't give any information about resharers, which is why there's no chart for them.

Twitter Report

The Twitter API provides no way to access more than 100 retweeters of a specific tweet. The kdecommunity account has only 3 tweets that got more than 100 retweets, and 2 of them are actually retweets of tweets by Qt. This means that the retweet numbers you see are not fully accurate, as some retweeters were not counted. However, only one tweet written by kdecommunity itself has more than 100 retweets, so the numbers are not far from accurate. Everything that happens from now on should be recorded through the Twitter streaming API, so future tweets shouldn't have that problem.

The kdecommunity account seems to do lots of retweets. For these retweets, the Twitter API returns the total number of retweets of the original tweet as the number of retweets, and always returns 0 as the number of favorites. In the report, I'm showing the number of retweets as returned by the API (the total number of retweets) and the number of favorites as the total number of favorites that the post got. I'm showing retweets by the page in a separate graph. However, in the doers part, I don't count retweeters of a retweeted tweet in the doers line graph or in the retweeters pie charts.

The first graph you see is the tweets graph, along with the user interactions for these tweets. This is similar to Facebook's first graph. The second graph shows the statistics for the retweets done by the page. The third is similar to Facebook's comments graph: it shows the number of retweets of KDE's tweets over time. The fourth one shows retweeters over time and is similar to the commenters line graph in Facebook. The pie charts are also the same as Facebook's.

You can see the report here: http://reports.kde.org/en/projects/kde-community/twitter_report

Google+ Report

The Google+ API was actually one of the best. It provides all the data needed to generate reports about all user interactions. The graphs shown in this report are similar to Facebook's and Twitter's graphs. The first one shows the posts made by the page and the total number of interactions that the posts in that period got. The second one shows replies on the dates they were made. The third one shows the number of unique doers in each period. Pie charts are available for 3 categories: plusoners (+1ers), repliers and commenters.

The Google+ API does not provide any way to subscribe to changes as of now. The report (or at least its last few posts) needs to be regenerated periodically.

You can see the report here: http://reports.kde.org/en/projects/kde-community/google_plus_report