4 The settings file and command line arguments

The settings file allows finer control of Genepop and/or batch processing. Further control is possible by using optional arguments when launching Genepop through the operating system command line, following the general syntax explained below for the settings file, e.g.

 Genepop EstimationPloidy=Haploid DifferentiationTest=Proba

Indeed, command line arguments are written to the file cmdline.txt, which is then read in much the same way as the settings file.[8]

Henceforth, menu options are called options and batch file/command line options are called settings.

Running Genepop help will display the help information, which so far is no more than a list of available settings, loosely grouped semantically. A file showing all possible settings is the following:

 // sample Genepop settings file, showing all options.
 /*********** Syntax of this file:
 lines without 'equal' symbol are ignored (hence this one is).
 Lines beginning with a '/', a '#' or a '%' are also ignored,
 / even if they contain '=' (hence this one is).
 /*********** General options ***********
 Mode=Ask
 GenepopInputFile=sample.txt
 Dememorisation=10000
 BatchLength=5000
 BatchNumber=100
 //EstimationPloidy=Haploid
 //RandomSeed=12345678
 //MantelSeed=87654321
 /***     allele sizes stuff
 //AllelicDistance=Size
 AlleleSizes=1:5,2:10,3:15,10:50
 /*** selecting menu options
 MenuOptions=8
 /********** Option 1 (HW tests) ***********
 HWtests=Enumeration
 /           Emulating HW.BAT
 //HWFile=HWtest
 //HWfileOptions=4,3
 /********** Option 2 ("linkage" disequilibrium) ***********
 //          old Genepop behaviour
 /GameticDiseqTest=Proba
 /********** Option 3 (differentiation) ***********
 //          old Genepop behaviour
 /DifferentiationTest=Proba
 /           Emulating STRUC.BAT
 //strucFile=structest
 /********** Option 4 (private alleles) ***********
 //no specific setting, but may be affected
 //by the estimationPloidy setting
 /** Option 5 (basic information, Fis, gene diversities... )
 //no specific setting, but may be affected
 // by the AlleleSizes setting
 /***** Option 6 (F-statistics, isolation by distance) *****
 IsolationStatistic=e
 GeographicScale=Linear
 MinimalDistance=1
 CIcoverage=0.9
 testPoint=0.00123
 //MantelRankTest=
 /PopTypes= 1 2 1 2 3
 /PopTypeSelection= all
 //PhylipMatrix=
 /           Emulating ISOLDE
 //IsolationFile=Isoldetest
 /           Extending ISOLDE to multiple matrices
 //MultiMigFile=perlocusStuff
 / Isolation by distance with user-provided geographic distances
 //geoDistFile=someFile
 /********** Option 7 (file conversions) ***********
 //no specific setting
 /********** Option 8 (Various utilities) ***********
 NullAlleleMethod=ApparentNulls
 CIcoverage=0.9
 /******** Testing performance of some options *********
 // Option 6.x: options as above plus
 //Performance=aLinear
 //GenepopRootFile=file
 //JobMin=1
 //JobMax=100
 /********* Checking some limits of Genepop ***********
 //Maxima=

Each setting is specified following a Keyword=value syntax. Capitalisation is not important (it is here only to ease reading) except for file names if the operating system cares about it (as Linux does).
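The syntax rules stated here and in the sample file can be sketched in Python as follows. This is only an illustration, not Genepop's own parser: the function name `parse_settings` is hypothetical, and the trimming of surrounding whitespace is an assumption. Pairs are returned as a list rather than a dictionary because a keyword such as AlleleSizes may appear several times.

```python
def parse_settings(text):
    """Return (keyword, value) pairs from Genepop-style settings text.

    Illustrative sketch only; whitespace trimming is an assumption.
    """
    pairs = []
    for line in text.splitlines():
        stripped = line.strip()
        # Lines beginning with '/', '#' or '%' are ignored, even with '='.
        if stripped.startswith(("/", "#", "%")):
            continue
        # Lines without an 'equal' symbol are ignored.
        if "=" not in stripped:
            continue
        keyword, value = stripped.split("=", 1)
        # Capitalisation of keywords is not important; values (e.g. file
        # names) keep their case because the operating system may care.
        pairs.append((keyword.strip().lower(), value.strip()))
    return pairs


sample = """\
// sample settings
this line has no equal symbol
/DifferentiationTest=Proba
Mode=Ask
BatchLength=5000
AlleleSizes=
"""
settings = parse_settings(sample)
print(settings)
```

Only the last three lines of `sample` yield settings; the commented-out DifferentiationTest line is skipped although it contains '='.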

By default, Genepop seeks settings in the file Genepop.txt, but one can specify another settings file through the command line, as was shown in the session examples:

 Genepop settingsFile=SampleSettings.txt

The SettingsFile setting must be the first argument on the command line.
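When Genepop is launched from a script, the ordering constraint can be enforced while the argument list is being built. The helper below is a hypothetical sketch (`build_command` is not part of Genepop), and it assumes the executable is reachable under the name Genepop, as in the examples above.

```python
def build_command(settings_file=None, **settings):
    """Build an argument list for launching Genepop (hypothetical helper)."""
    args = ["Genepop"]  # assumed executable name, as used in this section
    if settings_file is not None:
        # The SettingsFile setting must be the first argument.
        args.append(f"settingsFile={settings_file}")
    # Any further settings follow, each written as Keyword=value.
    args.extend(f"{key}={value}" for key, value in settings.items())
    return args


cmd = build_command(
    "SampleSettings.txt",
    EstimationPloidy="Haploid",
    DifferentiationTest="Proba",
)
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run` to start the program.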

Settings specific to each menu option will be explained along with the description of each option. Settings affecting several menu options are the following:

GenepopInputFile (or simply InputFile)

The name of the input file, in Genepop format.

Dememorisation, BatchLength and BatchNumber

These are the Markov chain parameters, whose meaning is explained in Section 7.3:

Dememorisation, the dememorisation number. The default is 10000;[9] values below 100 are not allowed.

BatchNumber, the number of batches. The default is 20 for sub-options 1.4 and 1.5 (multisample HW tests), and 100 otherwise; values below 10 are not allowed.

BatchLength, the number of iterations per batch. The default is 5000;[10] values below 400 are not allowed.

The maximum allowed value of these parameters depends on the C++ compiler: it is the compiler's maximum size_t value, which is at least 65535 and typically much larger on recent compilers. See the Maxima setting if you really need more information about this value.
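The defaults and lower bounds of the Markov chain parameters can be summarised in a small pre-flight check, for instance before generating many settings files. This is a hypothetical sketch, not Genepop code: the function names are invented, and the compiler-dependent upper bound is deliberately not checked.

```python
# Lower bounds stated in this section.
MINIMA = {"Dememorisation": 100, "BatchNumber": 10, "BatchLength": 400}


def default_mc_parameters(menu_option=None):
    """Defaults from this section; BatchNumber is 20 for sub-options 1.4 and 1.5."""
    batch_number = 20 if menu_option in ("1.4", "1.5") else 100
    return {
        "Dememorisation": 10000,
        "BatchNumber": batch_number,
        "BatchLength": 5000,
    }


def check_mc_parameters(params):
    """Return one message per parameter that is below its allowed minimum."""
    errors = []
    for name, minimum in MINIMA.items():
        if name in params and params[name] < minimum:
            errors.append(f"{name}={params[name]} is below the minimum of {minimum}")
    return errors


print(check_mc_parameters({"Dememorisation": 50, "BatchLength": 5000}))
```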

EstimationPloidy

In multilocus estimates, only diploid data are taken into account unless the setting EstimationPloidy=Haploid is given, in which case only haploid data are taken into account. This setting applies to options 4 (private allele method), 5.2 and 5.3 (for multilocus estimates of gene diversities), and 6 (F-statistics and isolation by distance).

Mode

Genepop has three modes. Mode=Ask asks for some feedback even in cases where the answer has been prespecified (e.g. through some setting), which may be useful when one wishes to change some settings in the course of a session; for example, it asks for confirmation of the Markov chain (MC) parameters. Mode=Batch does not wait for feedback: execution of Genepop should complete without any user intervention. Mode=Default (which in most cases does not need to be explicitly specified) asks for unspecified settings but does not request confirmation of prespecified ones, and also pauses and waits for feedback when some notable information is displayed.

MenuOptions

This tells Genepop to run the analyses as given through the menus: MenuOptions=1.1 will run option 1 sub-option 1 (test for heterozygote deficit), MenuOptions=1.1,2.2 will run option 1.1 then 2.2, and so on.
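A MenuOptions value is thus an ordered, comma-separated list of option or option.sub-option entries. The following sketch splits such a value into pairs; `parse_menu_options` is a hypothetical helper for scripts that generate or check settings files, not a Genepop function.

```python
def parse_menu_options(value):
    """Split a MenuOptions value into (option, sub_option) pairs, in order."""
    result = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        # "1.1" -> option 1, sub-option 1; "8" -> option 8, no sub-option.
        option, _, sub_option = item.partition(".")
        result.append((int(option), int(sub_option) if sub_option else None))
    return result


print(parse_menu_options("1.1,2.2"))  # [(1, 1), (2, 2)]
print(parse_menu_options("8"))        # [(8, None)]
```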

AllelicDistance=Size (or =AlleleSize)

This tells Genepop to use allele size-based statistics (where meaningful). Allele sizes are allele names unless specified by the next setting:

AlleleSizes

Successive AlleleSizes lines apply to successive loci. In the above example, the line AlleleSizes=1:5,2:10,3:15,10:50 says that at the first locus, allele 1 has size 5, allele 2 has size 10, and so on. 0 cannot be given a size since it means missing information. Any unlisted allele retains its name as its size. If a second AlleleSizes line follows, it specifies allele sizes at the second locus. A third line AlleleSizes= (don’t forget the ‘=’) implies that at the third locus, all alleles retain their name as their size; it is needed only so that a fourth line AlleleSizes=1:5,2:10,3:15,10:50 refers to the fourth locus. With four AlleleSizes declarations, alleles retain their name as their size for any locus beyond the fourth one.
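The rules for successive AlleleSizes declarations can be sketched as follows. The code is illustrative only and is not taken from Genepop: both function names are hypothetical, the values given for the second locus are invented, and integer allele sizes are an assumption.

```python
def parse_allele_sizes(declarations):
    """Map each declaration (one per locus, in order) to {allele: size}."""
    per_locus = []
    for declaration in declarations:
        sizes = {}
        for item in declaration.split(","):
            item = item.strip()
            if not item:
                continue  # empty declaration: no allele is listed
            allele, size = (int(part) for part in item.split(":"))
            if allele == 0:
                raise ValueError("0 means missing information and cannot be given a size")
            sizes[allele] = size
        per_locus.append(sizes)
    return per_locus


def allele_size(per_locus, locus, allele):
    """Size of an allele at a 1-based locus; unlisted alleles keep their name."""
    if locus <= len(per_locus):
        return per_locus[locus - 1].get(allele, allele)
    return allele  # locus beyond the last declaration


declarations = [
    "1:5,2:10,3:15,10:50",  # first locus
    "1:4,2:8",              # second locus (hypothetical values)
    "",                     # third locus: 'AlleleSizes=' with nothing after '='
    "1:5,2:10,3:15,10:50",  # fourth locus
]
table = parse_allele_sizes(declarations)
print(allele_size(table, 1, 2))  # listed allele at the first locus
print(allele_size(table, 3, 2))  # empty declaration: name kept as size
print(allele_size(table, 5, 3))  # beyond the fourth locus: name kept as size
```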

RandomSeed and MantelSeed

One may change the seed of the pseudo-random number generator with the setting RandomSeed=value, except for the Mantel test, for which the seed is given by the setting MantelSeed=value. The default value for both seeds is 67144630.

Maxima

With this setting, Genepop will only display some maximal values, including the maximum int and long int values for the compiler (the Markov chain dememorisation and batch length are long int, and the number of batches is int).


  8. Long command lines: under some old versions of Windows, the command line had a fairly limited maximum length, so it should have been used with moderation. This should no longer be a problem with recent versions of Windows, but who knows with Microsoft… One may try to find more information about command-line string limitation on support.microsoft.com.

  9. Increased from Genepop 3.4’s default.

  10. Increased from Genepop 3.4’s default.