27 February, 2017

Tableau Functions

Functions

The calculation functions are grouped into categories. These are the same categories used in the calculation editor. The aggregate functions such as sum, average, and so on are described in Aggregations.
For information on calculations, see Calculated Fields.

Details(Officials)

Tableau Data Types

Data Types

Tableau supports string, date/datetime, number, and boolean data types. These data types are automatically handled in the proper fashion. However, if you create calculated fields of your own, you need to be aware of how to use and combine the different data types in formulas. For example, you cannot add a string to a number. Also, many functions that are available to you when you define a calculation only work when they are applied to specific data types. For example, the DATEPART() function can accept only a date/datetime data type as an argument. So, you can write DATEPART('year',#April 15,2004#) and expect a valid result: 2004. You cannot write DATEPART('year',"Tom Sawyer") and expect a valid result. In fact, this example returns an error because "Tom Sawyer" is a string, not a date/datetime.
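As a rough analogy outside Tableau, the type restriction described above can be sketched in Python. The `datepart_year` helper below is hypothetical and is not a Tableau API; it only illustrates that a date function accepts a date/datetime value and rejects a string.

```python
from datetime import date, datetime

def datepart_year(value):
    """Hypothetical stand-in for DATEPART('year', value); not Tableau code."""
    # Accept only date or datetime values, mirroring the rule in the text.
    if not isinstance(value, (date, datetime)):
        raise TypeError(f"expected date/datetime, got {type(value).__name__}")
    return value.year

# A real date yields its year.
print(datepart_year(date(2004, 4, 15)))

# A string is rejected with an error instead of producing a result.
try:
    datepart_year("Tom Sawyer")
except TypeError as err:
    print("error:", err)
```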
Although Tableau will attempt to fully validate all calculations, some data type errors cannot be found until the query is run against the database. These issues appear as error dialogs at the time of the query rather than in the calculation dialog box.
The data types supported by Tableau are described below. Refer to Type Conversion to learn about converting from one data type to another.

STRING

A sequence of zero or more characters. For example, "Wisconsin", "ID-44400", and "Tom Sawyer" are all strings. Strings are delimited by single or double quotes. The quote character itself can be included in a string by repeating it, for example 'O''Hanrahan'.
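The quote-doubling rule can be sketched in Python for anyone who generates calculation text programmatically. The `tableau_string_literal` helper is an illustrative assumption, not part of any Tableau library.

```python
def tableau_string_literal(text, quote="'"):
    """Hypothetical helper: wrap text in quotes, doubling any embedded quote."""
    if quote not in ("'", '"'):
        raise ValueError("quote must be a single or double quote character")
    # Repeat each embedded quote character so it is read as a literal quote.
    return quote + text.replace(quote, quote * 2) + quote

print(tableau_string_literal("O'Hanrahan"))             # 'O''Hanrahan'
print(tableau_string_literal("Tom Sawyer", quote='"'))  # "Tom Sawyer"
```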

DATE/DATETIME

A date or a datetime, for example "January 23, 1972" or "January 23, 1972 12:32:00 AM". If you want a date written in long-hand style to be interpreted as a date/datetime, place the # sign on either side of it. For instance, "January 23, 1972" is treated as a string data type, but #January 23, 1972# is treated as a date/datetime data type.

NUMBER

Numerical values in Tableau can be either integers or floating-point numbers.
With floating-point numbers, results of some aggregations may not always be exactly as expected. For example, you may find that the SUM function returns a value such as -1.42e-14 for a column of numbers that you know should sum to exactly 0. This happens because the Institute of Electrical and Electronics Engineers (IEEE) 754 floating-point standard requires that numbers be stored in binary format, which means that numbers are sometimes rounded at extremely fine levels of precision. You can eliminate this potential distraction by using the ROUND function (see Number Functions) or by formatting the number to show fewer decimal places.
Operations that test floating-point values for equality can behave unpredictably for the same reason. Such comparisons can occur when using level of detail expressions as dimensions, in categorical filtering, creating ad-hoc groups, creating IN/OUT sets, and with data blending.
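The same IEEE 754 behavior is easy to reproduce outside Tableau. The following Python sketch (the values are chosen only for illustration) shows a running sum that should be zero but is not, and two ways of coping: rounding, and comparing with a tolerance instead of testing for exact equality.

```python
import math

values = [0.1, 0.2, -0.3]   # mathematically, these sum to exactly 0

# Accumulate left to right, as a plain running sum would.
total = 0.0
for v in values:
    total += v

print(total)                 # a tiny nonzero residue, not 0.0
print(total == 0)            # False: exact equality is unreliable

# Rounding hides the residue, much like ROUND or display formatting.
print(round(total, 10) == 0)                     # True

# A tolerance-based comparison is safer than exact equality.
print(math.isclose(total, 0.0, abs_tol=1e-9))    # True
```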
Note: The largest signed 64-bit integer is 9,223,372,036,854,775,807. When connecting to a new data source, any column with data type set to Number (Whole) can accommodate values up to this limit; for larger values, Tableau will use floating point.
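A short Python sketch shows where this limit comes from and why falling back to floating point costs precision. It assumes a standard 64-bit IEEE 754 double, which is what Python's `float` uses; it does not inspect Tableau itself.

```python
# The largest signed 64-bit integer is 2^63 - 1.
INT64_MAX = 2**63 - 1
print(INT64_MAX)                     # 9223372036854775807

# A 64-bit float has only 53 bits of mantissa, so it cannot hold
# this value exactly; conversion rounds to the nearest representable number.
as_float = float(INT64_MAX)
print(int(as_float) == INT64_MAX)    # False
print(int(as_float) - INT64_MAX)     # 1
```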

BOOLEAN

A field that contains the values TRUE or FALSE. An unknown value arises when the result of a comparison is unknown. For example, the expression 7 > Null yields unknown. Unknown booleans are automatically converted to Null.
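The unknown-to-Null rule can be sketched in Python, with `None` standing in for Null. The `greater_than` function is a hypothetical illustration of the behavior described above, not Tableau's implementation.

```python
def greater_than(left, right):
    """Hypothetical sketch of a comparison where None plays the role of Null."""
    # If either operand is Null, the result is unknown, represented as None.
    if left is None or right is None:
        return None
    return left > right

print(greater_than(7, 3))      # True
print(greater_than(7, None))   # None
```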


09 January, 2017

TESSERACT Manual


NAME
tesseract - command line OCR engine 

SYNOPSIS
tesseract imagename|stdin outputbase|stdout [options...] [configfile...]

DESCRIPTION
tesseract(1) is a commercial-quality OCR engine originally developed at HP between 1985 and 1995. In 1995, this engine was among the top 3 evaluated by UNLV. It was open-sourced by HP and UNLV in 2005, and has been developed at Google since then.

IN/OUT ARGUMENTS
imagename
The name of the input image. Most image file formats (anything readable by Leptonica) are supported.
stdin
Instruction to read data from standard input.
outputbase
The basename of the output file (to which the appropriate extension will be appended). By default the output will be named outbase.txt.
stdout
Instruction to send output data to standard output.

OPTIONS
--tessdata-dir /path
Specify the location of tessdata path
--user-words /path/to/file
Specify the location of user words file
--user-patterns /path/to/file
Specify the location of user patterns file
-c configvar=value
Set value for control parameter. Multiple -c arguments are allowed.
-l lang
The language to use. If none is specified, English is assumed. Multiple languages may be specified, separated by plus characters. Tesseract uses 3-character ISO 639-2 language codes. (See LANGUAGES)
--psm N
Set Tesseract to only run a subset of layout analysis and assume a certain form of image. The options for N are:
0 = Orientation and script detection (OSD) only.
1 = Automatic page segmentation with OSD.
2 = Automatic page segmentation, but no OSD, or OCR.
3 = Fully automatic page segmentation, but no OSD. (Default)
4 = Assume a single column of text of variable sizes.
5 = Assume a single uniform block of vertically aligned text.
6 = Assume a single uniform block of text.
7 = Treat the image as a single text line.
8 = Treat the image as a single word.
9 = Treat the image as a single word in a circle.
10 = Treat the image as a single character.
configfile
The name of a config to use. A config is a plaintext file which contains a list of variables and their values, one per line, with a space separating variable from value. Interesting config files include:
· hocr - Output in hOCR format instead of as a text file.
· pdf - Output in PDF instead of a text file.
Nota Bene: The options -l lang and --psm N must occur before any configfile.
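The ordering rules above can be sketched as a small Python helper that assembles an argument list in SYNOPSIS order: image, output base, options, then config files. The function name `build_tesseract_command` and the file names `page.png` and `out` are assumptions made for illustration; only the flags documented above are used.

```python
import shlex

def build_tesseract_command(image, outputbase, langs=None, psm=None, configs=()):
    """Hypothetical helper: assemble a tesseract argument list in SYNOPSIS order."""
    if psm is not None and psm not in range(11):
        raise ValueError("psm must be an integer from 0 to 10")
    command = ["tesseract", image, outputbase]
    if langs:
        # Multiple languages are joined with plus characters.
        command += ["-l", "+".join(langs)]
    if psm is not None:
        command += ["--psm", str(psm)]
    # Config files come last, after -l and --psm.
    command += list(configs)
    return command

cmd = build_tesseract_command(
    "page.png", "out", langs=["eng", "deu"], psm=6, configs=["hocr"]
)
print(shlex.join(cmd))   # tesseract page.png out -l eng+deu --psm 6 hocr
```

If tesseract is installed, the resulting list could be passed to `subprocess.run(cmd, check=True)`; building the list separately keeps the ordering logic easy to inspect.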

SINGLE OPTIONS
-v
Returns the current version of the tesseract(1) executable.
--list-langs
List available languages for the tesseract engine. Can be used with --tessdata-dir.
--print-parameters
Print tesseract parameters to stdout.
TESSERACT Manual details