Functions¶
The following tables contain all of the functions that can be used in WHERE and CALC expressions in GOR queries.
|
Returns the result of a GOR/NOR query |
Boolean Functions¶
Certain functions in GOR can be used to return a Boolean result (i.e. 0 for false, 1 for true) or a conditional result. By their nature, Boolean-type functions cannot be used with all CALC and WHERE commands (as other functions can). For example, the IF function returns one of two values based on a condition set in the function. As a result, IF can only be used with CALC commands.
Function |
Description |
More info |
---|---|---|
|
Returns the second parameter if the condition in the first parameter is true, otherwise the third parameter. |
|
|
Test if a column has integer value. |
|
|
Test if a column has floating point value. |
|
|
Returns true if the comma separated list has an element in the lit-list. |
|
|
Case-sensitive version of CONTAINSANY |
|
|
Returns true if the string contains all the string literals in the list. Not case-sensitive. |
|
|
Returns true if the string contains all the string literals in the list. Not case-sensitive. Same as CONTAINS, CONTAINSALL. |
|
|
Returns true if the string contains any of the string literals in the list. Not case-sensitive. Same as CONTAINS, CONTAINSALL. |
|
|
Returns the number of elements in the comma separated list that are in the lit-list. |
|
|
Case-sensitive version of CONTAINS, CONTAINSALL. |
|
|
Returns true if the string contains all the string literals in the list. Not case-sensitive. Same as CONTAINS, CONTAINSALL. |
|
|
Returns true if the string contains any of the string literals in the list. Not case-sensitive. Same as CONTAINS, CONTAINSALL. |
Type and String Conversion¶
Function |
Description |
More info |
---|---|---|
|
Convert number to string. |
|
|
Same as STR. |
|
|
Convert string to float. |
|
|
Convert string to float with default value. |
|
|
Same as FLOAT. |
|
|
Convert string to integer. |
|
|
Converts an integer number to base 26 representation. |
|
|
Converts an integer number to PN-like ID, e.g. AAAAAAA |
|
|
The length of the string. |
|
|
The string reversed. |
|
|
Trims whitespace from the beginning and end of a string. |
|
|
The MD5 message-digest of the string. |
|
|
In order of appearance, i.e. returns an integer number based on the order in which str value shows up. |
|
|
Upper-case a string. |
|
|
Lower-case a string. |
|
|
Return the specified left-most characters in a string. |
|
|
Return the specified right-most characters in a string. |
|
|
A substring. A zero-based range, not including the last position. |
|
|
A substring. A zero-based starting position and size. |
|
|
Replace from a string a pattern with the new string. |
|
|
The position within a string of a string. -1 if not found. |
|
|
Adds brackets around the string, e.g. LISTMAP(LISTZIP(col1,col2),’BRACKETS(x)’). |
|
|
Removes surrounding brackets from a string. |
|
|
Adds single quotes around the string, e.g. LISTMAP(LISTZIP(col1,col2),’SQUOTE(x)’). |
|
|
Removes surrounding single quotes from a string. |
|
|
Adds double quotes around the string, e.g. LISTMAP(LISTZIP(col1,col2),’DQUOTE(x)’). |
|
|
Removes surrounding double quotes from a string. |
|
|
Format a number, specifying the space and decimals (C-style %x.yf format style). |
|
|
Format a number, specifying the space and decimals (C-style %x.ye format style). |
|
|
Format a number, specifying the space and decimals (C-style %x.yg format style). |
|
|
Extract a single substring based on regular expression binding pattern, e.g. with single brackets. |
|
|
Decode (map) the string, given key-value mapping pairs. |
Algebraic Functions¶
Function |
Description |
More info |
---|---|---|
|
Floor value, e.g. the largest integer that is smaller or equal. |
|
|
Ceiling value, e.g. the smallest integer that is larger or equal. |
|
|
The closest integer. |
|
|
The quotient in integer division. |
|
|
The remainder in integer division. |
|
|
Raise the first parameter to the power of the second. |
|
|
Minimum. |
|
|
Maximum. |
|
|
Square root. |
|
|
Square. |
|
|
The absolute value, e.g. abs(x) = |x|. |
|
|
10-based logarithm. |
|
|
Natural logarithm. |
|
|
The natural exponent, e.g. exp(x) = e^x. |
|
|
A random floating point number. |
|
|
The distance between two segments. Returns 0 if segments overlap. |
|
|
Determines if two segments overlap. |
Trigonometric Functions¶
Function |
Description |
More info |
---|---|---|
|
Sine based on radians. |
|
|
Cosine based on radians. |
|
|
Arc-sine or inverse sine. |
|
|
Arc-cosine. |
|
|
Tangents based on radians. |
|
|
Arc-tangents. |
Statistical Functions¶
Function |
Description |
More info |
---|---|---|
|
The chi-squared value in a 2x2 table. |
chi_2x2 |
|
Chi-square complement. |
chi2 |
|
Chi-square |
|
|
Chi-square complement. Same as CHI2. |
chi2 |
|
The 2-sided Fisher-exact or Chi-square based p-val for a 2x2 table. |
pval |
|
The one-sided Fisher-exact or Chi-square based p-val for a 2x2 table. |
pvalone |
|
Student T distribution. (degrees of freedom, integration end point) |
student |
|
Inverse Student T distribution (alpha, size). |
invstudent |
|
Normal distribution |
normal |
|
Inverse Normal distribution. |
invnormal |
|
Poisson distribution. |
poission |
|
Poisson complement. See http://acs.lbl.gov/software/colt/api/cern/jet/stat/Probability.html |
poissonc |
List Functions¶
The table below shows a complete list of functions for dealing with lists in the GOR query language.
The first input value for any list function is the source of the list. This can be explicitly set by either by listing elements in quotation marks or by referencing a column in the input stream. The other input values vary depending on the function. String inputs must be input
With functions that allow a sep
or separator value, note that the separator value in the list is assumed to be a comma and the separator value is optional. If a separator other than a comma is used in the list, the separator value should be defined in quotation marks.
Click on “More info” for any of the functions below to get more information about how to use the function in a GOR query.
Function |
Description |
More info |
---|---|---|
|
Returns true if the comma separated list has an element in the literal list. |
|
|
Returns a number indicating how many items in the literal list are contained in the list defined in the first parameter. |
|
|
Returns true if the string contains all the string literals in the list. Not case-sensitive. |
|
|
Returns true if the string contains all the string literals in the list. Case-sensitive. |
|
|
Returns a number indicating how many items in the literal list are contained in the string. Also try CONTAINSANY, CSCONTAINSANY, CSCONTAINSCOUNT, LISTHASCOUNT, and CSLISTHASCOUNT. |
|
|
Retrieves the first element (i.e. the head) in a comma-separated list, with an optional separator argument. |
|
|
Second element in a comma separated list, with an optional separator argument. |
|
|
Nth element in a comma separated list, with an optional separator argument. |
|
|
Last element in a comma separated list, with an optional separator argument. |
|
|
The tail (the list minus the first element), with an optional separator argument. |
|
|
The list reversed, with an optional separator argument. |
|
|
The list sorted alphabetically in a ascending order. |
|
|
The list sorted alphabetically in a descending order. |
listsortdesc |
|
The list sorted numerically in a ascending order. |
listnumsortasc |
|
The list sorted numerically in a descending order |
listnumsortdesc |
|
A comma separated list trimmed from white-spaces. |
listtrim |
|
The distinct elements in the list, i.e. corresponding set. |
listdist |
|
The maximum element (element as string). |
listmax |
|
The minimum element (element as string). |
listmin |
|
The size of the list, with an optional separator argument. |
listsize |
|
Count frequency of elements in a list, returning the pairs (e1;count1,..,en;countn), with an optional separator argument. |
listcount |
|
The maximum element (element as number). |
listnummax |
|
The minimum element (element as number). |
listnummin |
|
The sum of the elements (element as numbers). |
listnumsum |
|
The average of the elements (element as numbers). |
listnumavg |
|
The unbiased standard deviation of the elements (element as numbers). |
listnumstd |
|
The one based index to a list of elements where a target is found. The function will search the list defined in parameter 1 for the first instance of parameter 2. Optional separator argument. |
listindex |
|
The list translated using expression provided in second argument. Element denoted with x. Optional separator argument. |
|
|
The list filtered using expression provided in second argument. Element as x, index as i. Optional separator argument. Example LISTFILTER(col,’x != 1’) or LISTFILTER(col,’i > 2’) |
|
|
Two lists zipped together, each pair of elements separated with a semicolon. Optional separator and delimter arguments. |
|
|
Filter the first list by the content of the second list, with an optional separator argument. Optional separator and delimter arguments. |
listzipfilter |
|
Returns a semi-comma-separated list of all combinations of elements in the input list of length within the interval specified by the input integers. |
|
|
Returns a list with the given item added to the end, with an optional separator argument. |
|
|
The list of equally separated values (second argument) translated using expression provided in third argument. Fourth argument is the result separator. Element denoted with x. Example FSVMAP(col,2,’x+1’,’,’). Also see LISTMAP. |
fsvmap |
|
Collapse values from multiple columns into a single list, with an optional separator argument. |
|
|
Collapse values from multiple columns into a single list with an expression applied, with optional separator argument |
Genomic-Specific Functions¶
Function |
Description |
More info |
---|---|---|
|
The Hamming-like distance between two haplotype strings. |
hapldiff |
|
Variant signature. |
varsig |
|
Reverse complement of a DNA sequence string. |
revcompl |
|
Shorthand for REVCOMPL(str). |
revcompl |
|
BAM cigar string for the corresponding reverse complement sequence. |
revcigar |
|
The reference base at the given locus, based on the build specified in the gor_config.txt file. |
refbase |
|
The reference bases, based on the build specified in the gor_config.txt file. |
refbases |
|
Extract a single substring from an attribute value TAG_VALUE-like field (as in BAM files). |
bamtag |
|
Extract a single substring from an attribute value field (as in GFF or VCF files, e.g. use semicolon ‘;’ as separator). |
tag |
|
Gets a value from a vcf file. Parameters are: <name of the format column that contains the format of the data column>, <the name of the data column that contains data for a given PN, the column name is the same as the pn>, <the tag to get value for>. |
|
|
Returns 1 if IUPAC genotype string contains SNP allele str, zero otherwise. |
iha |
|
Converts IUPAC genotype to ‘A1/A2’ genotype. |
iupac2gt |
|
Input IUPAC genotypes for subject, father and mother. Returns ‘0’ if IHE, ‘1’ if OK, and ‘2’ if OK and phase-able. |
iupacgtstat |
|
Returns the SNP allele of the father. Only valid if IUPACGTSTAT returns 2. |
iupacfa |
|
Returns the SNP allele of the mother. Only valid if IUPACGTSTAT returns 2. |
iupacma |
|
Input two (pos,Ref,Alleles) genotypes where Alleles = ‘All1,All2,..’ or ‘All1/All2/..’ or ‘All1|All2’. Returns the number of identical allels based on all pairwise comparisons between Alleles1 and Alleles2. First parameter is chr. |
gtshare |
|
Input (pos,Ref,Alt) genotypes for subject, father and mother. Returns ‘0’ if IHE, ‘1’ if OK, and ‘2’ if OK and phase-able. |
gtstat |
|
Returns the Alt allele of the father. Only valid if GTSTAT returns ‘2’. |
gtfa |
|
Returns the Alt allele of the mother. Only valid if GTSTAT returns ‘2’. |
gtma |
|
Not a standard function. Should be used as | where go_id INDAG(‘go.txt’,’GO:111111’) or INDAG([#temp#],’GO:111111’) |
indag |
Genotype-quality Functions¶
Function |
Description |
More info |
---|---|---|
|
Turn 2 chars into homozygous genotype prob |
chars2prom |
|
Turn 2 chars into heterozygous genotype prob |
chars2prhet |
|
Turn 2 chars into genotype dosage |
chars2dose |
|
Turn 2 chars, round((1.0-pr)*93.0)+33 , into genotype prob triplet (Pr(gt=0),Pr(gt=1),Pr(gt=2)). Two spaces (c32c32) map to ‘0;0;0’ (unknown). Assumes non-phased. |
chars2prprpr |
|
Turn 2 chars, round((1.0-pr)*93.0)+33 , into genotype prob doublet (Pr(gt=1),Pr(gt=2)). |
chars2prpr |
|
Turn on char into probability, e.g CHAR2PR(‘!’) = 1.0, CHAR2PR(‘~’)=0. |
chars2pr |
|
Turn probability to characters, i.e. the semi-inverse of CHAR2PR. |
pr2char |
|
Turn probability pair to chars, e.g. PRPR2CHAR(‘1.0;0.0’) = ‘!~’ |
prpr2chars |
|
Turn probability triplet to chars, e.g. PRPR2CHAR(‘1.0;0.0;0.0’) = ‘!~’ |
prprpr2chars |
|
Turn probability pair to chars, e.g. PRPR2CHAR(‘1.0;0.0’) = ‘!~’ with custom separator |
prpr2chars |
|
Turn probability triplet to chars, e.g. PRPRPR2CHAR(‘1.0;0.0’) = ‘!~’ with custom separator |
prprpr2chars |
|
Turn 2 chars, round((1.0-pr)*93.0)+33 , into genotype 0,1,2 if prob >= thresh else 3. Two spaces (c32c32) map to 3 (unknown). Assumes non-phased probabilities. |
chars2gt |
|
Turn 2 chars, round((1.0-pr)*93.0)+33 , into phased genotype 0,1,2 if prob >= thresh else 3. Two spaces (c32c32) map to 3 (unknown). Assumes probabilities of phased haplotypes. Pr(gt=0) = (1.0 - pfalt) * (1.0 - pmalt), Pr(gt=1) = (1.0 - pfalt) * pmalt + pfalt * (1.0 - pmalt), Pr(gt=2) = pfalt * pmalt, where pfalt and pmalt are the chars probabilities values. |
charsphased2gt |
Date Functions¶
Function |
Description |
More info |
---|---|---|
|
Add some days to the given date. |
|
|
Add some months to the given date. |
|
|
Add some years to the given date. |
|
|
The current date and time in the format ‘yyyy-MM-dd HH:mm:ss’ - same as DATE. |
|
|
The current date and time in a specific format - same as DATE. |
|
|
The current date and time in the format ‘yyyy-MM-dd HH:mm:ss’. |
|
|
The current date and time in a specific format, defined by a string of characters that represent date/time units. |
|
|
The difference, in days, between two dates. |
|
|
The day of week of the given date. |
|
|
The day of month of the given date. |
|
|
The day of year of the given date. |
|
|
A specific time, indicated by a timestamp, in the format ‘yyyy-MM-dd HH:mm:ss’. |
edate |
|
A specific time, indicated by a timestamp, in a specific format. The format is defined in the same way as with date(string). |
edate |
|
A timestamp of the current time. |
epoch |
|
A timestamp of a specific time, indicated with a specified format. The format is defined in the same way as with date(string) and edate(long, string). Example: epoch(‘16/06/2017’,’dd/MM/yyyy’). |
epoch |
|
The month of the given date. |
|
|
The difference, in months, between two dates. |
|
|
The year of the given date. |
|
|
The difference, in years, between two dates. |
Administration Functions¶
Diagnostic Functions¶
Function |
Description |
More info |
---|---|---|
|
The time in milli seconds since the query started. |
time |
|
Sleep for given milliseconds while processing each row |
sleep |
|
Name of the host running the query |
hostname |
|
IP number of the host running the query |
ip |
|
CPU architecture of the host running the query |
arch |
|
Thread id of the thread running the query |
threadid |
|
The cpuload of the process running the query |
cpuload |
|
The cpuload on the system running the query |
syscpuload |
|
Free physical memory on the system running the query |
free |
|
Free memory on the system running the query |
freemem |
|
Total memory on the system running the query |
totalmem |
|
Maximum memory of the process running the query |
maxmem |
|
Number of available cpus on the system |
availcpu |
|
Number of open filedescriptors on the system |
openfiles |
|
Maximum number of file descriptors |
maxfiles |
|
Returns one line from the stdout of a whitelisted system command |
system |
|
Returns the average seektime for the current rowSource in milliseconds |
avgseektimemillis |
|
Returns average rows per millisecond for the current rowSource |
avgrowspermillis` |
|
Returns average bases per millisecond for the current rowSource |
avgbasespermillis |