This is a patch release which introduces some hotfixes and new data in get_commits() output.

repo_url column to output of get_commits() function (#535).
verbose mode is set to FALSE (#525) and fixed checking token scopes for GitLab (#526).
get_repos_urls() output when individual repositories are set in set_*_host() (#529). Earlier, the function pulled all repositories for an organization even though individual repositories, not whole organizations, were defined for the host. This is similar to the issue solved earlier (#439).

This is a patch release which introduces some improvements in get_R_package_usage() (speed and the possibility to pull data on multiple R packages at once), a new get_storage() function and some fixes for checking token scopes and setting hosts.
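
A hedged sketch of how the multi-package lookup and get_storage() from this release might be used together. The organization and package names are placeholders, the parameter names (packages, split_output) are taken from these notes, and the network part is skipped unless GitStats is installed and a GITHUB_PAT token is set.

```r
# Packages to look up; placeholders chosen for illustration only.
packages <- c("shiny", "purrr")

# Run the network part only when GitStats and a token are available.
can_run <- requireNamespace("GitStats", quietly = TRUE) &&
  nzchar(Sys.getenv("GITHUB_PAT"))

if (can_run) {
  gs <- GitStats::create_gitstats()
  # "r-world-devs" is an example organization, not a requirement.
  gs <- GitStats::set_github_host(gs, orgs = "r-world-devs")

  # With split_output = TRUE a list with one tibble per package is
  # returned instead of a single combined tibble.
  usage <- GitStats::get_R_package_usage(
    gs,
    packages = packages,
    split_output = TRUE
  )

  # Retrieve everything pulled so far from the GitStats object.
  stored <- GitStats::get_storage(gs)
  str(stored, max.level = 1)
}
```

Keeping the package names in a plain character vector makes it easy to extend the lookup without changing the call itself.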

get_R_package_usage() function:
  packages parameter replacing old package_name (#494),
  split_output parameter has been added - when set to TRUE, a list with tibbles (one element of the list for every package) is returned instead of one tibble.
get_repos() (#492). Earlier this was only possible for GitHub organizations and GitLab groups.
get_storage() function to retrieve data from GitStats object - whole or particular datasets (e.g. commits, repositories or R_package_usage) (#509).
GitHost is not passed to GitStats. This also applies to the situation when GitStats looks for default tokens (not defined by user). Earlier, if tests for the token failed, an empty token was passed and GitStats was created, which was misleading for the user.
github.com or https://github.com to set_github_host() (#475).
{host_url}, http://{host_url} or https://{host_url} to host parameter in set_*_host() function (#399).

This minor release comes with a new get_files_structure() function and adjustments to get_files_content(), so users can pull a custom file tree from a repository (by defining a file pattern and directory depth) and then pull the content of those files.
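
A minimal sketch of the two-step flow described in this release, under stated assumptions: the repository name and the regex are placeholders, passing repositories as "owner/name" strings is assumed, and the calls only run when GitStats is installed and a GITHUB_PAT token is set.

```r
# Illustrative values: match Markdown files, at most two directory levels.
file_pattern <- "\\.md$"
max_depth <- 2

# Run the network part only when GitStats and a token are available.
can_run <- requireNamespace("GitStats", quietly = TRUE) &&
  nzchar(Sys.getenv("GITHUB_PAT"))

if (can_run) {
  gs <- GitStats::create_gitstats()
  # Example repository; the "owner/name" format is an assumption.
  gs <- GitStats::set_github_host(gs, repos = "r-world-devs/GitStats")

  # Step 1: pull the file tree into GitStats storage.
  files_structure <- GitStats::get_files_structure(
    gs,
    pattern = file_pattern,
    depth = max_depth
  )

  # Step 2: pull file content. With file_path left as NULL and
  # use_files_structure left as TRUE (both defaults per these notes),
  # the structure stored in step 1 is used.
  files_content <- GitStats::get_files_content(gs)
}
```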

get_files_structure() function to pull the files structure for a given repository, with the possibility to control the level of directories (depth parameter) and to limit output to files matching the regex passed to the pattern parameter (#338). Together with that, the get_files() function was renamed to get_files_content() to better reflect its purpose.
get_files_content() so it can make use of files_structure pulled to GitStats storage with get_files_structure() function - if file_path is set to NULL and the use_files_structure parameter to TRUE (both are defaults) (#467).
progress parameter to user functions to control showing of cli progress bar separately from messages (which are controlled with verbose) (#465).
orgs nor repos specified from warning to info (#456).
gh-pages, lint and check for bumping version.

This is a patch release with substantial improvements to some functions (get_repos(), get_files() and get_R_package_usage()), adding with_files and in_files parameters, fixing the cache feature and introducing a new get_repos_urls() function, a minimalist version of get_repos():

get_repos_urls() function to fetch repository URLs (either web or API - choose with type parameter). It may also return only those repository URLs that contain a given file or files (pass an argument to the with_files parameter) or a given text in code blobs (with_code parameter). This is a minimalist version of get_repos(), which skips the whole process of parsing (the search response into a repositories one) and adding statistics on repositories. This makes it poorer in content but faster (#425).
with_files parameter to get_repos() function, which makes it possible to search for repositories with a given file or files and return full output for repositories.
with_code parameter (as a character vector) in get_repos() and get_repos_urls() (#282).
in_files parameter to get_repos() which works with with_code parameter. When both are defined, GitStats searches code blobs only in the given files.
dplyr::glimpse() from get_*() functions, so there is printing to console only if the output of a get_*() function is not assigned to an object (#426).
get_R_package_usage() now also includes repository full name (#438).
get_R_package_usage() with optimizing search of package names in DESCRIPTION and NAMESPACE files by removing the filtering method and replacing it with a filename: filter directly in the search endpoint query (#428).
get_files() when scanning scope is set to repositories. Earlier, it pulled given files from whole organizations, even if scanning scope was set to repos with set_*_host(). Now it shows only files for the given repositories (#439).
verbose parameter now controls showing of the progress bars (#453).

This is a patch release with some hot issues that needed to be addressed, notably covering set_*_host() functions with verbose control, tweaking the verbose feature in general, fixing pulling data for GitLab subgroups and speeding up the get_files() function.
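
A small sketch of the verbose control mentioned above, assuming that verbose is accepted by set_github_host() and that verbose_off() takes the GitStats object, as these notes describe. The organization name is a placeholder, and nothing runs unless GitStats is installed and a GITHUB_PAT token is set.

```r
# Whether to show messages while setting the host (illustrative flag).
show_messages <- FALSE

# Run the network part only when GitStats and a token are available.
can_run <- requireNamespace("GitStats", quietly = TRUE) &&
  nzchar(Sys.getenv("GITHUB_PAT"))

if (can_run) {
  gs <- GitStats::create_gitstats()

  # Option 1: silence messages for this single call.
  gs <- GitStats::set_github_host(
    gs,
    orgs = "r-world-devs",  # example organization
    verbose = show_messages
  )

  # Option 2: switch messages off for the whole object.
  GitStats::verbose_off(gs)
}
```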

GitStats is set to scan whole hosts, with switching to Search API instead of pulling files via GraphQL (with iteration over organizations and repositories) (#411).
orgs or repos, GitStats no longer pulls all organizations. Pulling all organizations from the host is triggered only when the user decides to pull repositories from organizations. If the user decides, e.g., to pull repositories by code, there is no need to pull all organizations (which may be a time-consuming process), as GitStats then uses Search API (#393).
set_*_host() functions with verbose_off() or verbose parameter (#413).
verbose to FALSE does not hide the output of the get_*() functions - i.e. a glimpse of the table will always appear after pulling data, even if verbose is switched off. The verbose parameter now serves only to show and hide messages to the user (#423).
set_*_host() function (#415)

This is a major release with general changes in workflow (simplifying it), changes in setting GitStats hosts, deprecation of some not very useful features (like plots, setting parameters separately) and a new get_release_logs() function.
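
A hedged sketch of the simplified workflow this release introduces, using only function and parameter names that appear in these notes (set_github_host(), get_repos(), get_commits(), since, with_code, verbose). The organization, date and search phrase are placeholders, and the calls run only when GitStats is installed and a GITHUB_PAT token is set.

```r
# Illustrative inputs.
since_date <- "2024-01-01"
code_phrase <- "shiny"

# Run the network part only when GitStats and a token are available.
can_run <- requireNamespace("GitStats", quietly = TRUE) &&
  nzchar(Sys.getenv("GITHUB_PAT"))

if (can_run) {
  gs <- GitStats::create_gitstats()
  # No host argument is passed, as the public host is the target.
  gs <- GitStats::set_github_host(gs, orgs = "r-world-devs")

  # get_*() functions return tables directly instead of the GitStats object.
  repos <- GitStats::get_repos(gs, verbose = FALSE)
  commits <- GitStats::get_commits(gs, since = since_date)

  # Searching by code blob no longer needs a separate parameter-setting step.
  repos_with_code <- GitStats::get_repos(gs, with_code = code_phrase)
}
```

Calling the same get_*() function again with unchanged parameters should, per the notes for this release, read from GitStats storage rather than the API.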

set_host() function is replaced with more explicit set_github_host() and set_gitlab_host() (#373). If you wish to connect to a public host (e.g. api.github.com), you do not need to pass an argument to the host parameter.
repositories, commits, R_package_usage or other you should directly use the corresponding get_*() functions instead of pull_*(), which are deprecated. These get_*() functions pull data from API, parse it into a table, add some goodies (additional columns) if needed and return a table instead of a GitStats object, which in our opinion is more intuitive and user-friendly (#345). That means you do not need to run two or three additional function calls in a pipe as before, e.g. pull_repos(gitstats_object) %>% get_repos() %>% get_repos_stats(), but you just run get_repos(gitstats_object) to get the data you need.
get_*() function GitStats will pull the data from its storage and not from API as for the first time, unless you change parameters for the function (e.g. starting date with since in get_commits()) or directly change the cache parameter in the function (#333).
pull_repos_contributors() as a separate function is deprecated. The parameter add_contributors is now set by default to TRUE in get_repos(), which seems more reasonable as the user gets all the data.
get_commits() old parameters (date_from and date_until) were replaced with new, more concise ones (since and until).
set_params() function is removed (#386). Now the logic is moved straight to get_*() functions. For example, if you want to pull repositories with a specific code blob, you do not need to define anything with set_params() (as previously with search_mode and phrase parameter) but you simply run get_repos(with_code = 'your_code') (#333).
verbose has been introduced for limiting messages to the user when pulling data - this parameter can be set in all get_*() functions. You can also turn the verbose mode on/off globally with verbose_on()/verbose_off() functions.
get_repos_stats() function was deprecated as its role was unclear - unlike get_commit_stats() it did not aggregate repositories data into a new stats table, but only added some new numeric columns, like number of contributors (contributors_n) or last activity in difftime format, which is now done within get_repos() function.
team and filtering by language is no longer supported - these features were quite heavy for the package performance and did not bring much added value. If needed, the user can always filter the output (formatted responses pulled from API) by contributors or language (#384).
GitStats, they have been deprecated as the package is meant to be basically for back end purposes and this is the field where developer's effort should now go (#381). If needed and requested, plot functions may be brought back in future releases.
get_release_logs() (#356).
get_orgs() is renamed to show_orgs() to reflect that it does not pull data from API, but only shows what is in the GitStats object.
author_login and author_name (#332). This is due to the mix of GitHub/GitLab handles and display names in the author column (the original author name field in commits API response).
GitStats object - now when you return a GitStats object in console, it prints GitStats data divided into sections to give more readable information to the user: scanning scope (organizations and repositories), and storage (the output tables stored in GitStats with basic information on dimensions) (#329).
contributors response (#331).
gts_to_posixt() helper, which depended on stringr, caused for some users an empty value to be passed to the since parameter of the commits endpoint, which ended in a Bad Request Error (400) and an infinite loop of retrying the response (#360).

pull_R_package_usage() with get_R_package_usage() functions to pull repositories where package name is found in DESCRIPTION or NAMESPACE files or code blobs with phrases related to using an R package (library(package), require(package)) (#326, #341),
pull_files() with get_files() to pull content of text files (#200).
GitStats with set_host() function by using repos parameter instead of orgs (#330).
id to repo_id and name to repo_name,
default_branch column to repositories output as a consequence of #200.
get_*_stats() functions to prepare summary stats from pulled data: repositories and commits (#276),
gitstats_plot() which takes as an input repos_stats or commits_stats class objects (#276),
get_* to pull_*; get_* functions are now to retrieve already pulled data from GitStats object (#294),
setup() to set_params() (#294),
set_connection() to set_host() (#271),
add_team_member() to set_team_member() (#271).
GITHUB_PAT or GITLAB_PAT, there is no need to pass them as an argument to set_host() (#120),
pull_users() function to pull information on users (#199),
orgs are passed (#258),
get_orgs() function to print all organizations (#283),
reset() function (#270)
reset_language() or setting language parameter to All in setup() function (#231)
contributors as basic stat when pulling repos by org and by phrase to improve speed of pulling repositories data. Added pull_repos_contributors() user function and add_contributors parameter to pull_repos() function to conditionally add information on contributors to repositories table (#235)
api_url column as an address to the repository, not the host (#201),
%>% (#289).

This is the first release of GitStats with given features:

create_gitstats() - creating GitStats object,
set_connection() - adding hosts to GitStats object,
setup() - setting search parameter to org, team or phrase, setting programming language of repositories,
get_repos() - pulling repositories from GitHub and GitLab API in a standardized table,
get_commits() - pulling commits from GitHub and GitLab API in a standardized table,
set_team_member() - adding team members to GitStats object.