Table of Contents
- 12.1 SQL Function and Operator Reference
- 12.2 User-Defined Function Reference
- 12.3 Type Conversion in Expression Evaluation
- 12.4 Operators
- 12.5 Flow Control Functions
- 12.6 Numeric Functions and Operators
- 12.7 Date and Time Functions
- 12.8 String Functions and Operators
- 12.9 What Calendar Is Used By MySQL?
- 12.10 Full-Text Search Functions
- 12.10.1 Natural Language Full-Text Searches
- 12.10.2 Boolean Full-Text Searches
- 12.10.3 Full-Text Searches with Query Expansion
- 12.10.4 Full-Text Stopwords
- 12.10.5 Full-Text Restrictions
- 12.10.6 Fine-Tuning MySQL Full-Text Search
- 12.10.7 Adding a User-Defined Collation for Full-Text Indexing
- 12.10.8 ngram Full-Text Parser
- 12.10.9 MeCab Full-Text Parser Plugin
- 12.11 Cast Functions and Operators
- 12.12 XML Functions
- 12.13 Bit Functions and Operators
- 12.14 Encryption and Compression Functions
- 12.15 Locking Functions
- 12.16 Information Functions
- 12.17 Spatial Analysis Functions
- 12.17.1 Spatial Function Reference
- 12.17.2 Argument Handling by Spatial Functions
- 12.17.3 Functions That Create Geometry Values from WKT Values
- 12.17.4 Functions That Create Geometry Values from WKB Values
- 12.17.5 MySQL-Specific Functions That Create Geometry Values
- 12.17.6 Geometry Format Conversion Functions
- 12.17.7 Geometry Property Functions
- 12.17.8 Spatial Operator Functions
- 12.17.9 Functions That Test Spatial Relations Between Geometry Objects
- 12.17.10 Spatial Geohash Functions
- 12.17.11 Spatial GeoJSON Functions
- 12.17.12 Spatial Convenience Functions
- 12.18 JSON Functions
- 12.18.1 JSON Function Reference
- 12.18.2 Functions That Create JSON Values
- 12.18.3 Functions That Search JSON Values
- 12.18.4 Functions That Modify JSON Values
- 12.18.5 Functions That Return JSON Value Attributes
- 12.18.6 JSON Table Functions
- 12.18.7 JSON Schema Validation Functions
- 12.18.8 JSON Utility Functions
- 12.19 Functions Used with Global Transaction Identifiers (GTIDs)
- 12.20 Aggregate Functions
- 12.21 Window Functions
- 12.22 Performance Schema Functions
- 12.23 Internal Functions
- 12.24 Miscellaneous Functions
- 12.25 Precision Math
Expressions can be used at several points in SQL statements, such as in the ORDER BY or HAVING clauses of SELECT statements, in the WHERE clause of a SELECT, DELETE, or UPDATE statement, or in SET statements. Expressions can be written using literal values, column values, NULL, built-in functions, stored functions, user-defined functions, and operators. This chapter describes the SQL functions and operators that are permitted for writing expressions in MySQL. Instructions for writing stored functions and user-defined functions are given in Section 25.2, “Using Stored Routines”, and Adding Functions to MySQL. See Section 9.2.5, “Function Name Parsing and Resolution”, for the rules describing how the server interprets references to different kinds of functions.
An expression that contains NULL always produces a NULL value unless otherwise indicated in the documentation for a particular function or operator.
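As a brief illustration of this rule, the statements below are a minimal sketch using literal values. The first two show NULL propagating through an operator and a function. The last three use COALESCE(), IFNULL(), and the <=> operator, which are documented as handling NULL differently. The results in the comments follow the documented behavior of each function or operator.

```sql
-- NULL propagates through ordinary operators and functions.
SELECT 1 + NULL;                   -- NULL
SELECT CONCAT('My', NULL, 'QL');   -- NULL

-- Functions and operators documented as exceptions to the rule.
SELECT COALESCE(NULL, 1);          -- 1
SELECT IFNULL(NULL, 10);           -- 10
SELECT NULL <=> NULL;              -- 1
```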
By default, there must be no whitespace between a function name and the parenthesis following it. This helps the MySQL parser distinguish between function calls and references to tables or columns that happen to have the same name as a function. However, spaces around function arguments are permitted.
You can tell the MySQL server to accept spaces after function names by starting it with the --sql-mode=IGNORE_SPACE option. (See Section 5.1.11, “Server SQL Modes”.) Individual client programs can request this behavior by using the CLIENT_IGNORE_SPACE option for mysql_real_connect(). In either case, all function names become reserved words.
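The difference can be sketched as follows. The table name t is hypothetical and exists only for this example. Setting sql_mode for the current session is shown here as an alternative to the server startup option, on the assumption that a session-level change is enough for experimentation.

```sql
-- Hypothetical table used only for this sketch.
CREATE TABLE t (i INT);

-- Accepted by default: no whitespace between COUNT and the parenthesis.
SELECT COUNT(*) FROM t;

-- Rejected by default with a syntax error, because of the space:
-- SELECT COUNT (*) FROM t;

-- Enable IGNORE_SPACE for this session only.
-- Note: this assignment replaces any other SQL modes set for the session.
SET SESSION sql_mode = 'IGNORE_SPACE';

-- Now accepted; in exchange, function names are treated as reserved words.
SELECT COUNT (*) FROM t;
```

Because function names become reserved words under IGNORE_SPACE, an identifier that shares a name with a function, such as a column named count, must then be quoted with backticks.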
For the sake of brevity, most examples in this chapter display the output from the mysql program in abbreviated form. Rather than showing examples in this format:
mysql> SELECT MOD(29,9);
+-----------+
| mod(29,9) |
+-----------+
|         2 |
+-----------+
1 row in set (0.00 sec)
This format is used instead:
mysql> SELECT MOD(29,9);
-> 2
The following table lists each SQL function and operator and provides a short description of each one. For a table listing user-defined functions, see Section 12.2, “User-Defined Function Reference”.
Table 12.1 SQL Functions and Operators
Name | Description |
---|---|
& |
Bitwise AND |
> |
Greater than operator |
>> |
Right shift |
>= |
Greater than or equal operator |
< |
Less than operator |
<> , != |
Not equal operator |
<< |
Left shift |
<= |
Less than or equal operator |
<=> |
NULL-safe equal to operator |
% , MOD |
Modulo operator |
* |
Multiplication operator |
+ |
Addition operator |
- |
Minus operator |
- |
Change the sign of the argument |
-> |
Return value from JSON column after evaluating path; equivalent to JSON_EXTRACT(). |
->> |
Return value from JSON column after evaluating path and unquoting the result; equivalent to JSON_UNQUOTE(JSON_EXTRACT()). |
/ |
Division operator |
:= |
Assign a value |
= |
Assign a value (as part of a SET statement, or as part of the SET clause in an UPDATE statement) |
= |
Equal operator |
^ |
Bitwise XOR |
ABS() |
Return the absolute value |
ACOS() |
Return the arc cosine |
ADDDATE() |
Add time values (intervals) to a date value |
ADDTIME() |
Add time |
AES_DECRYPT() |
Decrypt using AES |
AES_ENCRYPT() |
Encrypt using AES |
AND , && |
Logical AND |
ANY_VALUE() |
Suppress ONLY_FULL_GROUP_BY value rejection |
ASCII() |
Return numeric value of left-most character |
ASIN() |
Return the arc sine |
ATAN() |
Return the arc tangent |
ATAN2() , ATAN() |
Return the arc tangent of the two arguments |
AVG() |
Return the average value of the argument |
BENCHMARK() |
Repeatedly execute an expression |
BETWEEN ... AND ... |
Whether a value is within a range of values |
BIN() |
Return a string containing binary representation of a number |
BIN_TO_UUID() |
Convert binary UUID to string |
BINARY |
Cast a string to a binary string |
BIT_AND() |
Return bitwise AND |
BIT_COUNT() |
Return the number of bits that are set |
BIT_LENGTH() |
Return length of argument in bits |
BIT_OR() |
Return bitwise OR |
BIT_XOR() |
Return bitwise XOR |
CAN_ACCESS_COLUMN() |
Internal use only |
CAN_ACCESS_DATABASE() |
Internal use only |
CAN_ACCESS_TABLE() |
Internal use only |
CAN_ACCESS_USER() (introduced 8.0.22) |
Internal use only |
CAN_ACCESS_VIEW() |
Internal use only |
CASE |
Case operator |
CAST() |
Cast a value as a certain type |
CEIL() |
Return the smallest integer value not less than the argument |
CEILING() |
Return the smallest integer value not less than the argument |
CHAR() |
Return the character for each integer passed |
CHAR_LENGTH() |
Return number of characters in argument |
CHARACTER_LENGTH() |
Synonym for CHAR_LENGTH() |
CHARSET() |
Return the character set of the argument |
COALESCE() |
Return the first non-NULL argument |
COERCIBILITY() |
Return the collation coercibility value of the string argument |
COLLATION() |
Return the collation of the string argument |
COMPRESS() |
Compress a string and return the result as a binary string |
CONCAT() |
Return concatenated string |
CONCAT_WS() |
Return concatenated string with separator |
CONNECTION_ID() |
Return the connection ID (thread ID) for the connection |
CONV() |
Convert numbers between different number bases |
CONVERT() |
Cast a value as a certain type |
CONVERT_TZ() |
Convert from one time zone to another |
COS() |
Return the cosine |
COT() |
Return the cotangent |
COUNT() |
Return a count of the number of rows returned |
COUNT(DISTINCT) |
Return the count of a number of different values |
CRC32() |
Compute a cyclic redundancy check value |
CUME_DIST() |
Cumulative distribution value |
CURDATE() |
Return the current date |
CURRENT_DATE() , CURRENT_DATE |
Synonyms for CURDATE() |
CURRENT_ROLE() |
Return the current active roles |
CURRENT_TIME() , CURRENT_TIME |
Synonyms for CURTIME() |
CURRENT_TIMESTAMP() , CURRENT_TIMESTAMP |
Synonyms for NOW() |
CURRENT_USER() , CURRENT_USER |
The authenticated user name and host name |
CURTIME() |
Return the current time |
DATABASE() |
Return the default (current) database name |
DATE() |
Extract the date part of a date or datetime expression |
DATE_ADD() |
Add time values (intervals) to a date value |
DATE_FORMAT() |
Format date as specified |
DATE_SUB() |
Subtract a time value (interval) from a date |
DATEDIFF() |
Subtract two dates |
DAY() |
Synonym for DAYOFMONTH() |
DAYNAME() |
Return the name of the weekday |
DAYOFMONTH() |
Return the day of the month (0-31) |
DAYOFWEEK() |
Return the weekday index of the argument |
DAYOFYEAR() |
Return the day of the year (1-366) |
DEFAULT() |
Return the default value for a table column |
DEGREES() |
Convert radians to degrees |
DENSE_RANK() |
Rank of current row within its partition, without gaps |
DIV |
Integer division |
ELT() |
Return string at index number |
EXP() |
Return e raised to the power of the argument |
EXPORT_SET() |
Return a string such that for every bit set in the value bits, you get an on string and for every unset bit, you get an off string |
EXTRACT() |
Extract part of a date |
ExtractValue() |
Extract a value from an XML string using XPath notation |
FIELD() |
Index (position) of first argument in subsequent arguments |
FIND_IN_SET() |
Index (position) of first argument within second argument |
FIRST_VALUE() |
Value of argument from first row of window frame |
FLOOR() |
Return the largest integer value not greater than the argument |
FORMAT() |
Return a number formatted to specified number of decimal places |
FORMAT_BYTES() (introduced 8.0.16) |
Convert byte count to value with units |
FORMAT_PICO_TIME() (introduced 8.0.16) |
Convert time in picoseconds to value with units |
FOUND_ROWS() |
For a SELECT with a LIMIT clause, the number of rows that would be returned were there no LIMIT clause |
FROM_BASE64() |
Decode base64 encoded string and return result |
FROM_DAYS() |
Convert a day number to a date |
FROM_UNIXTIME() |
Format Unix timestamp as a date |
GeomCollection() |
Construct geometry collection from geometries |
GeometryCollection() |
Construct geometry collection from geometries |
GET_DD_COLUMN_PRIVILEGES() |
Internal use only |
GET_DD_CREATE_OPTIONS() |
Internal use only |
GET_DD_INDEX_SUB_PART_LENGTH() |
Internal use only |
GET_FORMAT() |
Return a date format string |
GET_LOCK() |
Get a named lock |
GREATEST() |
Return the largest argument |
GROUP_CONCAT() |
Return a concatenated string |
GROUPING() |
Distinguish super-aggregate ROLLUP rows from regular rows |
GTID_SUBSET() |
Return true if all GTIDs in subset are also in set; otherwise false. |
GTID_SUBTRACT() |
Return all GTIDs in set that are not in subset. |
HEX() |
Hexadecimal representation of decimal or string value |
HOUR() |
Extract the hour |
ICU_VERSION() |
ICU library version |
IF() |
If/else construct |
IFNULL() |
Null if/else construct |
IN() |
Whether a value is within a set of values |
INET_ATON() |
Return the numeric value of an IP address |
INET_NTOA() |
Return the IP address from a numeric value |
INET6_ATON() |
Return the numeric value of an IPv6 address |
INET6_NTOA() |
Return the IPv6 address from a numeric value |
INSERT() |
Insert substring at specified position up to specified number of characters |
INSTR() |
Return the index of the first occurrence of substring |
INTERNAL_AUTO_INCREMENT() |
Internal use only |
INTERNAL_AVG_ROW_LENGTH() |
Internal use only |
INTERNAL_CHECK_TIME() |
Internal use only |
INTERNAL_CHECKSUM() |
Internal use only |
INTERNAL_DATA_FREE() |
Internal use only |
INTERNAL_DATA_LENGTH() |
Internal use only |
INTERNAL_DD_CHAR_LENGTH() |
Internal use only |
INTERNAL_GET_COMMENT_OR_ERROR() |
Internal use only |
INTERNAL_GET_ENABLED_ROLE_JSON() (introduced 8.0.19) |
Internal use only |
INTERNAL_GET_HOSTNAME() (introduced 8.0.19) |
Internal use only |
INTERNAL_GET_USERNAME() (introduced 8.0.19) |
Internal use only |
INTERNAL_GET_VIEW_WARNING_OR_ERROR() |
Internal use only |
INTERNAL_INDEX_COLUMN_CARDINALITY() |
Internal use only |
INTERNAL_INDEX_LENGTH() |
Internal use only |
INTERNAL_IS_ENABLED_ROLE() (introduced 8.0.19) |
Internal use only |
INTERNAL_IS_MANDATORY_ROLE() (introduced 8.0.19) |
Internal use only |
INTERNAL_KEYS_DISABLED() |
Internal use only |
INTERNAL_MAX_DATA_LENGTH() |
Internal use only |
INTERNAL_TABLE_ROWS() |
Internal use only |
INTERNAL_UPDATE_TIME() |
Internal use only |
INTERVAL() |
Return the index of the argument that is less than the first argument |
IS |
Test a value against a boolean |
IS_FREE_LOCK() |
Whether the named lock is free |
IS_IPV4() |
Whether argument is an IPv4 address |
IS_IPV4_COMPAT() |
Whether argument is an IPv4-compatible address |
IS_IPV4_MAPPED() |
Whether argument is an IPv4-mapped address |
IS_IPV6() |
Whether argument is an IPv6 address |
IS NOT |
Test a value against a boolean |
IS NOT NULL |
NOT NULL value test |
IS NULL |
NULL value test |
IS_USED_LOCK() |
Whether the named lock is in use; return connection identifier if true |
IS_UUID() |
Whether argument is a valid UUID |
ISNULL() |
Test whether the argument is NULL |
JSON_ARRAY() |
Create JSON array |
JSON_ARRAY_APPEND() |
Append data to JSON document |
JSON_ARRAY_INSERT() |
Insert into JSON array |
JSON_ARRAYAGG() |
Return result set as a single JSON array |
JSON_CONTAINS() |
Whether JSON document contains specific object at path |
JSON_CONTAINS_PATH() |
Whether JSON document contains any data at path |
JSON_DEPTH() |
Maximum depth of JSON document |
JSON_EXTRACT() |
Return data from JSON document |
JSON_INSERT() |
Insert data into JSON document |
JSON_KEYS() |
Array of keys from JSON document |
JSON_LENGTH() |
Number of elements in JSON document |
JSON_MERGE() (deprecated) |
Merge JSON documents, preserving duplicate keys. Deprecated synonym for JSON_MERGE_PRESERVE() |
JSON_MERGE_PATCH() |
Merge JSON documents, replacing values of duplicate keys |
JSON_MERGE_PRESERVE() |
Merge JSON documents, preserving duplicate keys |
JSON_OBJECT() |
Create JSON object |
JSON_OBJECTAGG() |
Return result set as a single JSON object |
JSON_OVERLAPS() (introduced 8.0.17) |
Compares two JSON documents, returns TRUE (1) if these have any key-value pairs or array elements in common, otherwise FALSE (0) |
JSON_PRETTY() |
Print a JSON document in human-readable format |
JSON_QUOTE() |
Quote JSON document |
JSON_REMOVE() |
Remove data from JSON document |
JSON_REPLACE() |
Replace values in JSON document |
JSON_SCHEMA_VALID() (introduced 8.0.17) |
Validate JSON document against JSON schema; returns TRUE/1 if document validates against schema, or FALSE/0 if it does not |
JSON_SCHEMA_VALIDATION_REPORT() (introduced 8.0.17) |
Validate JSON document against JSON schema; returns report in JSON format on outcome of validation, including success or failure and reasons for failure |
JSON_SEARCH() |
Path to value within JSON document |
JSON_SET() |
Insert or update data in JSON document |
JSON_STORAGE_FREE() |
Freed space within binary representation of JSON column value following partial update |
JSON_STORAGE_SIZE() |
Space used for storage of binary representation of a JSON document |
JSON_TABLE() |
Return data from a JSON expression as a relational table |
JSON_TYPE() |
Type of JSON value |
JSON_UNQUOTE() |
Unquote JSON value |
JSON_VALID() |
Whether JSON value is valid |
JSON_VALUE() (introduced 8.0.21) |
Extract value from JSON document at location pointed to by path provided; return this value as VARCHAR(512) or specified type |
LAG() |
Value of argument from row lagging current row within partition |
LAST_DAY() |
Return the last day of the month for the argument |
LAST_INSERT_ID() |
Value of the AUTO_INCREMENT column for the last INSERT |
LAST_VALUE() |
Value of argument from last row of window frame |
LCASE() |
Synonym for LOWER() |
LEAD() |
Value of argument from row leading current row within partition |
LEAST() |
Return the smallest argument |
LEFT() |
Return the leftmost number of characters as specified |
LENGTH() |
Return the length of a string in bytes |
LIKE |
Simple pattern matching |
LineString() |
Construct LineString from Point values |
LN() |
Return the natural logarithm of the argument |
LOAD_FILE() |
Load the named file |
LOCALTIME() , LOCALTIME |
Synonym for NOW() |
LOCALTIMESTAMP , LOCALTIMESTAMP() |
Synonym for NOW() |
LOCATE() |
Return the position of the first occurrence of substring |
LOG() |
Return the natural logarithm of the first argument |
LOG10() |
Return the base-10 logarithm of the argument |
LOG2() |
Return the base-2 logarithm of the argument |
LOWER() |
Return the argument in lowercase |
LPAD() |
Return the string argument, left-padded with the specified string |
LTRIM() |
Remove leading spaces |
MAKE_SET() |
Return a set of comma-separated strings that have the corresponding bit in bits set |
MAKEDATE() |
Create a date from the year and day of year |
MAKETIME() |
Create time from hour, minute, second |
MASTER_POS_WAIT() |
Block until the replica has read and applied all updates up to the specified position |
MATCH |
Perform full-text search |
MAX() |
Return the maximum value |
MBRContains() |
Whether MBR of one geometry contains MBR of another |
MBRCoveredBy() |
Whether one MBR is covered by another |
MBRCovers() |
Whether one MBR covers another |
MBRDisjoint() |
Whether MBRs of two geometries are disjoint |
MBREquals() |
Whether MBRs of two geometries are equal |
MBRIntersects() |
Whether MBRs of two geometries intersect |
MBROverlaps() |
Whether MBRs of two geometries overlap |
MBRTouches() |
Whether MBRs of two geometries touch |
MBRWithin() |
Whether MBR of one geometry is within MBR of another |
MD5() |
Calculate MD5 checksum |
MEMBER OF() (introduced 8.0.17) |
Returns true (1) if first operand matches any element of JSON array passed as second operand, otherwise returns false (0) |
MICROSECOND() |
Return the microseconds from argument |
MID() |
Return a substring starting from the specified position |
MIN() |
Return the minimum value |
MINUTE() |
Return the minute from the argument |
MOD() |
Return the remainder |
MONTH() |
Return the month from the date passed |
MONTHNAME() |
Return the name of the month |
MultiLineString() |
Construct MultiLineString from LineString values |
MultiPoint() |
Construct MultiPoint from Point values |
MultiPolygon() |
Construct MultiPolygon from Polygon values |
NAME_CONST() |
Cause the column to have the given name |
NOT , ! |
Negates value |
NOT BETWEEN ... AND ... |
Whether a value is not within a range of values |
NOT IN() |
Whether a value is not within a set of values |
NOT LIKE |
Negation of simple pattern matching |
NOT REGEXP |
Negation of REGEXP |
NOW() |
Return the current date and time |
NTH_VALUE() |
Value of argument from N-th row of window frame |
NTILE() |
Bucket number of current row within its partition. |
NULLIF() |
Return NULL if expr1 = expr2 |
OCT() |
Return a string containing octal representation of a number |
OCTET_LENGTH() |
Synonym for LENGTH() |
OR , || |
Logical OR |
ORD() |
Return character code for leftmost character of the argument |
PERCENT_RANK() |
Percentage rank value |
PERIOD_ADD() |
Add a period to a year-month |
PERIOD_DIFF() |
Return the number of months between periods |
PI() |
Return the value of pi |
Point() |
Construct Point from coordinates |
Polygon() |
Construct Polygon from LineString arguments |
POSITION() |
Synonym for LOCATE() |
POW() |
Return the argument raised to the specified power |
POWER() |
Return the argument raised to the specified power |
PS_CURRENT_THREAD_ID() (introduced 8.0.16) |
Performance Schema thread ID for current thread |
PS_THREAD_ID() (introduced 8.0.16) |
Performance Schema thread ID for given thread |
QUARTER() |
Return the quarter from a date argument |
QUOTE() |
Escape the argument for use in an SQL statement |
RADIANS() |
Return argument converted to radians |
RAND() |
Return a random floating-point value |
RANDOM_BYTES() |
Return a random byte vector |
RANK() |
Rank of current row within its partition, with gaps |
REGEXP |
Whether string matches regular expression |
REGEXP_INSTR() |
Starting index of substring matching regular expression |
REGEXP_LIKE() |
Whether string matches regular expression |
REGEXP_REPLACE() |
Replace substrings matching regular expression |
REGEXP_SUBSTR() |
Return substring matching regular expression |
RELEASE_ALL_LOCKS() |
Release all current named locks |
RELEASE_LOCK() |
Release the named lock |
REPEAT() |
Repeat a string the specified number of times |
REPLACE() |
Replace occurrences of a specified string |
REVERSE() |
Reverse the characters in a string |
RIGHT() |
Return the specified rightmost number of characters |
RLIKE |
Whether string matches regular expression |
ROLES_GRAPHML() |
Return a GraphML document representing memory role subgraphs |
ROUND() |
Round the argument |
ROW_COUNT() |
The number of rows updated |
ROW_NUMBER() |
Number of current row within its partition |
RPAD() |
Return the string argument, right-padded with the specified string |
RTRIM() |
Remove trailing spaces |
SCHEMA() |
Synonym for DATABASE() |
SEC_TO_TIME() |
Converts seconds to 'hh:mm:ss' format |
SECOND() |
Return the second (0-59) |
SESSION_USER() |
Synonym for USER() |
SHA1() , SHA() |
Calculate an SHA-1 160-bit checksum |
SHA2() |
Calculate an SHA-2 checksum |
SIGN() |
Return the sign of the argument |
SIN() |
Return the sine of the argument |
SLEEP() |
Sleep for a number of seconds |
SOUNDEX() |
Return a soundex string |
SOUNDS LIKE |
Compare sounds |
SPACE() |
Return a string of the specified number of spaces |
SQRT() |
Return the square root of the argument |
ST_Area() |
Return Polygon or MultiPolygon area |
ST_AsBinary() , ST_AsWKB() |
Convert from internal geometry format to WKB |
ST_AsGeoJSON() |
Generate GeoJSON object from geometry |
ST_AsText() , ST_AsWKT() |
Convert from internal geometry format to WKT |
ST_Buffer() |
Return geometry of points within given distance from geometry |
ST_Buffer_Strategy() |
Produce strategy option for ST_Buffer() |
ST_Centroid() |
Return centroid as a point |
ST_Contains() |
Whether one geometry contains another |
ST_ConvexHull() |
Return convex hull of geometry |
ST_Crosses() |
Whether one geometry crosses another |
ST_Difference() |
Return point set difference of two geometries |
ST_Dimension() |
Dimension of geometry |
ST_Disjoint() |
Whether one geometry is disjoint from another |
ST_Distance() |
The distance of one geometry from another |
ST_Distance_Sphere() |
Minimum distance on earth between two geometries |
ST_EndPoint() |
End Point of LineString |
ST_Envelope() |
Return MBR of geometry |
ST_Equals() |
Whether one geometry is equal to another |
ST_ExteriorRing() |
Return exterior ring of Polygon |
ST_FrechetDistance() (introduced 8.0.23) |
The discrete Fréchet distance of one geometry from another |
ST_GeoHash() |
Produce a geohash value |
ST_GeomCollFromText() , ST_GeometryCollectionFromText() , ST_GeomCollFromTxt() |
Return geometry collection from WKT |
ST_GeomCollFromWKB() , ST_GeometryCollectionFromWKB() |
Return geometry collection from WKB |
ST_GeometryN() |
Return N-th geometry from geometry collection |
ST_GeometryType() |
Return name of geometry type |
ST_GeomFromGeoJSON() |
Generate geometry from GeoJSON object |
ST_GeomFromText() , ST_GeometryFromText() |
Return geometry from WKT |
ST_GeomFromWKB() , ST_GeometryFromWKB() |
Return geometry from WKB |
ST_HausdorffDistance() (introduced 8.0.23) |
The discrete Hausdorff distance of one geometry from another |
ST_InteriorRingN() |
Return N-th interior ring of Polygon |
ST_Intersection() |
Return point set intersection of two geometries |
ST_Intersects() |
Whether one geometry intersects another |
ST_IsClosed() |
Whether a geometry is closed and simple |
ST_IsEmpty() |
Whether a geometry is empty |
ST_IsSimple() |
Whether a geometry is simple |
ST_IsValid() |
Whether a geometry is valid |
ST_LatFromGeoHash() |
Return latitude from geohash value |
ST_Latitude() (introduced 8.0.12) |
Return latitude of Point |
ST_Length() |
Return length of LineString |
ST_LineFromText() , ST_LineStringFromText() |
Construct LineString from WKT |
ST_LineFromWKB() , ST_LineStringFromWKB() |
Construct LineString from WKB |
ST_LineInterpolatePoint() (introduced 8.0.24) |
The point a given percentage along a LineString |
ST_LineInterpolatePoints() (introduced 8.0.24) |
The points a given percentage along a LineString |
ST_LongFromGeoHash() |
Return longitude from geohash value |
ST_Longitude() (introduced 8.0.12) |
Return longitude of Point |
ST_MakeEnvelope() |
Rectangle around two points |
ST_MLineFromText() , ST_MultiLineStringFromText() |
Construct MultiLineString from WKT |
ST_MLineFromWKB() , ST_MultiLineStringFromWKB() |
Construct MultiLineString from WKB |
ST_MPointFromText() , ST_MultiPointFromText() |
Construct MultiPoint from WKT |
ST_MPointFromWKB() , ST_MultiPointFromWKB() |
Construct MultiPoint from WKB |
ST_MPolyFromText() , ST_MultiPolygonFromText() |
Construct MultiPolygon from WKT |
ST_MPolyFromWKB() , ST_MultiPolygonFromWKB() |
Construct MultiPolygon from WKB |
ST_NumGeometries() |
Return number of geometries in geometry collection |
ST_NumInteriorRing() , ST_NumInteriorRings() |
Return number of interior rings in Polygon |
ST_NumPoints() |
Return number of points in LineString |
ST_Overlaps() |
Whether one geometry overlaps another |
ST_PointAtDistance() (introduced 8.0.24) |
The point a given distance along a LineString |
ST_PointFromGeoHash() |
Convert geohash value to POINT value |
ST_PointFromText() |
Construct Point from WKT |
ST_PointFromWKB() |
Construct Point from WKB |
ST_PointN() |
Return N-th point from LineString |
ST_PolyFromText() , ST_PolygonFromText() |
Construct Polygon from WKT |
ST_PolyFromWKB() , ST_PolygonFromWKB() |
Construct Polygon from WKB |
ST_Simplify() |
Return simplified geometry |
ST_SRID() |
Return spatial reference system ID for geometry |
ST_StartPoint() |
Start Point of LineString |
ST_SwapXY() |
Return argument with X/Y coordinates swapped |
ST_SymDifference() |
Return point set symmetric difference of two geometries |
ST_Touches() |
Whether one geometry touches another |
ST_Transform() (introduced 8.0.13) |
Transform coordinates of geometry |
ST_Union() |
Return point set union of two geometries |
ST_Validate() |
Return validated geometry |
ST_Within() |
Whether one geometry is within another |
ST_X() |
Return X coordinate of Point |
ST_Y() |
Return Y coordinate of Point |
STATEMENT_DIGEST() |
Compute statement digest hash value |
STATEMENT_DIGEST_TEXT() |
Compute normalized statement digest |
STD() |
Return the population standard deviation |
STDDEV() |
Return the population standard deviation |
STDDEV_POP() |
Return the population standard deviation |
STDDEV_SAMP() |
Return the sample standard deviation |
STR_TO_DATE() |
Convert a string to a date |
STRCMP() |
Compare two strings |
SUBDATE() |
Synonym for DATE_SUB() when invoked with three arguments |
SUBSTR() |
Return the substring as specified |
SUBSTRING() |
Return the substring as specified |
SUBSTRING_INDEX() |
Return a substring from a string before the specified number of occurrences of the delimiter |
SUBTIME() |
Subtract times |
SUM() |
Return the sum |
SYSDATE() |
Return the time at which the function executes |
SYSTEM_USER() |
Synonym for USER() |
TAN() |
Return the tangent of the argument |
TIME() |
Extract the time portion of the expression passed |
TIME_FORMAT() |
Format as time |
TIME_TO_SEC() |
Return the argument converted to seconds |
TIMEDIFF() |
Subtract time |
TIMESTAMP() |
With a single argument, this function returns the date or datetime expression; with two arguments, the sum of the arguments |
TIMESTAMPADD() |
Add an interval to a datetime expression |
TIMESTAMPDIFF() |
Return the difference between two datetime expressions, in the specified unit |
TO_BASE64() |
Return the argument converted to a base-64 string |
TO_DAYS() |
Return the date argument converted to days |
TO_SECONDS() |
Return the date or datetime argument converted to seconds since Year 0 |
TRIM() |
Remove leading and trailing spaces |
TRUNCATE() |
Truncate to specified number of decimal places |
UCASE() |
Synonym for UPPER() |
UNCOMPRESS() |
Uncompress a string compressed with COMPRESS() |
UNCOMPRESSED_LENGTH() |
Return the length of a string before compression |
UNHEX() |
Interpret each pair of hexadecimal digits in the argument as a byte and return the resulting binary string |
UNIX_TIMESTAMP() |
Return a Unix timestamp |
UpdateXML() |
Return replaced XML fragment |
UPPER() |
Convert to uppercase |
USER() |
The user name and host name provided by the client |
UTC_DATE() |
Return the current UTC date |
UTC_TIME() |
Return the current UTC time |
UTC_TIMESTAMP() |
Return the current UTC date and time |
UUID() |
Return a Universal Unique Identifier (UUID) |
UUID_SHORT() |
Return an integer-valued universal identifier |
UUID_TO_BIN() |
Convert string UUID to binary |
VALIDATE_PASSWORD_STRENGTH() |
Determine strength of password |
VALUES() |
Define the values to be used during an INSERT |
VAR_POP() |
Return the population variance |
VAR_SAMP() |
Return the sample variance |
VARIANCE() |
Return the population variance |
VERSION() |
Return a string that indicates the MySQL server version |
WAIT_FOR_EXECUTED_GTID_SET() |
Wait until the given GTIDs have executed on the replica. |
WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS() (deprecated 8.0.18) |
Use WAIT_FOR_EXECUTED_GTID_SET(). |
WEEK() |
Return the week number |
WEEKDAY() |
Return the weekday index |
WEEKOFYEAR() |
Return the calendar week of the date (1-53) |
WEIGHT_STRING() |
Return the weight string for a string |
XOR |
Logical XOR |
YEAR() |
Return the year |
YEARWEEK() |
Return the year and week |
| |
Bitwise OR |
~ |
Bitwise inversion |
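To make a few of the entries in Table 12.1 concrete, the statements below sketch their behavior on literal values. The selection is illustrative only, and the results in the comments follow the documented semantics of each function or operator.

```sql
-- Arithmetic operators and functions.
SELECT MOD(29, 9);                       -- 2
SELECT 5 DIV 2;                          -- 2 (integer division)

-- NULL-safe equality never returns NULL.
SELECT 1 <=> 1, NULL <=> NULL, 1 <=> NULL;   -- 1, 1, 0

-- String functions.
SELECT RPAD('hi', 5, '?');               -- 'hi???'
SELECT HEX('abc');                       -- 616263

-- Date and time functions.
SELECT TIMESTAMPDIFF(MONTH, '2003-02-01', '2003-05-01');   -- 3
SELECT DATEDIFF('2007-12-31 23:59:59', '2007-12-30');      -- 1

-- JSON functions; on a JSON column, -> and ->> are shorthand for these two forms.
SELECT JSON_EXTRACT('{"id": 14, "name": "Aztalan"}', '$.name');
                                         -- "Aztalan"
SELECT JSON_UNQUOTE(JSON_EXTRACT('{"id": 14, "name": "Aztalan"}', '$.name'));
                                         -- Aztalan
```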
The following table lists each user-defined function and provides a short description of each one. For a table listing SQL functions and operators, see Section 12.1, “SQL Function and Operator Reference”.
For general information about user-defined functions, see Section 5.7, “MySQL Server User-Defined Functions”.
Table 12.2 User-Defined Functions
Name | Description |
---|---|
asymmetric_decrypt() |
Decrypt ciphertext using private or public key |
asymmetric_derive() |
Derive symmetric key from asymmetric keys |
asymmetric_encrypt() |
Encrypt cleartext using private or public key |
asymmetric_sign() |
Generate signature from digest |
asymmetric_verify() |
Verify that signature matches digest |
asynchronous_connection_failover_add_managed() (introduced 8.0.23) |
Add a replication source server in a managed group to the source list |
asynchronous_connection_failover_add_source() (introduced 8.0.22) |
Add a replication source server to the source list |
asynchronous_connection_failover_delete_managed() (introduced 8.0.23) |
Remove managed group of replication source servers from the source list |
asynchronous_connection_failover_delete_source() (introduced 8.0.22) |
Remove a replication source server from the source list |
audit_api_message_emit_udf() |
Add message event to audit log |
audit_log_encryption_password_get() |
Fetch audit log encryption password |
audit_log_encryption_password_set() |
Set audit log encryption password |
audit_log_filter_flush() |
Flush audit log filter tables |
audit_log_filter_remove_filter() |
Remove audit log filter |
audit_log_filter_remove_user() |
Unassign audit log filter from user |
audit_log_filter_set_filter() |
Define audit log filter |
audit_log_filter_set_user() |
Assign audit log filter to user |
audit_log_read() |
Return audit log records |
audit_log_read_bookmark() |
Bookmark for most recent audit log event |
create_asymmetric_priv_key() |
Create private key |
create_asymmetric_pub_key() |
Create public key |
create_dh_parameters() |
Generate shared DH secret |
create_digest() |
Generate digest from string |
gen_blacklist() |
Perform dictionary term replacement |
gen_dictionary() |
Return random term from dictionary |
gen_dictionary_drop() |
Remove dictionary from registry |
gen_dictionary_load() |
Load dictionary into registry |
gen_range() |
Generate random number within range |
gen_rnd_email() |
Generate random email address |
gen_rnd_pan() |
Generate random payment card Primary Account Number |
gen_rnd_ssn() |
Generate random US Social Security number |
gen_rnd_us_phone() |
Generate random US phone number |
group_replication_get_communication_protocol() |
Return Group Replication protocol version |
group_replication_get_write_concurrency() |
Return maximum number of consensus instances executable in parallel |
group_replication_set_as_primary() |
Assign group member as new primary |
group_replication_set_communication_protocol() |
Set Group Replication protocol version |
group_replication_set_write_concurrency() |
Set maximum number of consensus instances executable in parallel |
group_replication_switch_to_multi_primary_mode() |
Change group from single-primary to multi-primary mode |
group_replication_switch_to_single_primary_mode() |
Change group from multi-primary to single-primary mode |
keyring_aws_rotate_cmk() |
Rotate AWS customer master key |
keyring_aws_rotate_keys() |
Rotate keys in keyring_aws storage file |
keyring_hashicorp_update_config() |
Cause runtime keyring_hashicorp reconfiguration |
keyring_key_fetch() |
Fetch keyring key value |
keyring_key_generate() |
Generate random keyring key |
keyring_key_length_fetch() |
Return keyring key length |
keyring_key_remove() |
Remove keyring key |
keyring_key_store() |
Store key in keyring |
keyring_key_type_fetch() |
Return keyring key type |
load_rewrite_rules() |
Rewriter plugin helper routine |
mask_inner() |
Mask interior part of string |
mask_outer() |
Mask left and right parts of string |
mask_pan() |
Mask payment card Primary Account Number part of string |
mask_pan_relaxed() |
Mask payment card Primary Account Number part of string |
mask_ssn() |
Mask US Social Security number |
mysql_firewall_flush_status() |
Reset firewall status variables |
normalize_statement() |
Normalize SQL statement to digest form |
read_firewall_users() |
Update firewall account profile cache |
read_firewall_whitelist() |
Update firewall account profile recorded-statement cache |
service_get_read_locks() |
Acquire locking service shared locks |
service_get_write_locks() |
Acquire locking service exclusive locks |
service_release_locks() |
Release locking service locks |
set_firewall_mode() |
Establish firewall account profile operational mode |
version_tokens_delete() |
Delete tokens from version tokens list |
version_tokens_edit() |
Modify version tokens list |
version_tokens_lock_exclusive() |
Acquire exclusive locks on version tokens |
version_tokens_lock_shared() |
Acquire shared locks on version tokens |
version_tokens_set() |
Set version tokens list |
version_tokens_show() |
Return version tokens list |
version_tokens_unlock() |
Release version tokens locks |
When an operator is used with operands of different types, type conversion occurs to make the operands compatible. Some conversions occur implicitly. For example, MySQL automatically converts strings to numbers as necessary, and vice versa.
mysql> SELECT 1+'1';
-> 2
mysql> SELECT CONCAT(2,' test');
-> '2 test'
It is also possible to convert a number to a string explicitly using the CAST() function. Conversion occurs implicitly with the CONCAT() function because it expects string arguments.
mysql> SELECT 38.8, CAST(38.8 AS CHAR);
-> 38.8, '38.8'
mysql> SELECT 38.8, CONCAT(38.8);
-> 38.8, '38.8'
See later in this section for information about the character set of implicit number-to-string conversions, and for modified rules that apply to CREATE TABLE ... SELECT statements.
The following rules describe how conversion occurs for comparison operations:
- If one or both arguments are NULL, the result of the comparison is NULL, except for the NULL-safe <=> equality comparison operator. For NULL <=> NULL, the result is true. No conversion is needed.
- If both arguments in a comparison operation are strings, they are compared as strings.
- If both arguments are integers, they are compared as integers.
- Hexadecimal values are treated as binary strings if not compared to a number.
- If one of the arguments is a TIMESTAMP or DATETIME column and the other argument is a constant, the constant is converted to a timestamp before the comparison is performed. This is done to be more ODBC-friendly. This is not done for the arguments to IN(). To be safe, always use complete datetime, date, or time strings when doing comparisons. For example, to achieve best results when using BETWEEN with date or time values, use CAST() to explicitly convert the values to the desired data type.
  A single-row subquery from a table or tables is not considered a constant. For example, if a subquery returns an integer to be compared to a DATETIME value, the comparison is done as two integers. The integer is not converted to a temporal value. To compare the operands as DATETIME values, use CAST() to explicitly convert the subquery value to DATETIME.
- If one of the arguments is a decimal value, comparison depends on the other argument. The arguments are compared as decimal values if the other argument is a decimal or integer value, or as floating-point values if the other argument is a floating-point value.
- In all other cases, the arguments are compared as floating-point (real) numbers. For example, a comparison of string and numeric operands takes place as a comparison of floating-point numbers.
For information about conversion of values from one temporal type to another, see Section 11.2.7, “Conversion Between Date and Time Types”.
Comparison of JSON values takes place at two levels. The first level of comparison is based on the JSON types of the compared values. If the types differ, the comparison result is determined solely by which type has higher precedence. If the two values have the same JSON type, a second level of comparison occurs using type-specific rules. For comparison of JSON and non-JSON values, the non-JSON value is converted to JSON and the values compared as JSON values. For details, see Comparison and Ordering of JSON Values.
The following examples illustrate conversion of strings to numbers for comparison operations:
mysql> SELECT 1 > '6x';
-> 0
mysql> SELECT 7 > '6x';
-> 1
mysql> SELECT 0 > 'x6';
-> 0
mysql> SELECT 0 = 'x6';
-> 1
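The hexadecimal, temporal, and decimal rules listed above can be illustrated in the same way. The statements below are a minimal sketch using literals only; the expected results in the comments follow from the stated rules and assume default server settings.

```sql
-- Compared to a string, a hexadecimal literal is treated as a binary string:
-- 0x41 is the single byte 'A'.
SELECT 0x41 = 'A';     -- 1

-- Compared to a number, the same literal is treated as the integer 65.
SELECT 0x41 = 65;      -- 1

-- An explicit CAST() makes a loosely formatted date string compare as a DATE.
SELECT CAST('2001-1-1' AS DATE) = CAST('2001-01-01' AS DATE);   -- 1

-- A decimal value compared to an integer is compared as decimal values.
SELECT 1.0 = 1;        -- 1
```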
For comparisons of a string column with a number, MySQL cannot use an index on the column to look up the value quickly. If str_col is an indexed string column, the index cannot be used when performing the lookup in the following statement:
SELECT * FROM tbl_name WHERE str_col=1;
The reason for this is that there are many different strings that may convert to the value 1, such as '1', ' 1', or '1a'.
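To make the difference concrete, the following sketch contrasts a numeric constant with a string constant. The table tbl_demo and the index idx_str_col are hypothetical names introduced only for this illustration, and the plan reported by EXPLAIN may vary with server version and data.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE tbl_demo (
  str_col VARCHAR(10),
  KEY idx_str_col (str_col)
);
INSERT INTO tbl_demo VALUES ('1'), (' 1'), ('1a'), ('2');

-- Numeric constant: each str_col value is converted to a number before the
-- comparison, so '1', ' 1', and '1a' all match and the index cannot be used
-- to look up the value.
SELECT str_col FROM tbl_demo WHERE str_col = 1;

-- String constant: a string-to-string comparison, which can use idx_str_col
-- and matches only '1' among these rows.
SELECT str_col FROM tbl_demo WHERE str_col = '1';

-- EXPLAIN shows the access method chosen for each form.
EXPLAIN SELECT str_col FROM tbl_demo WHERE str_col = 1;
EXPLAIN SELECT str_col FROM tbl_demo WHERE str_col = '1';
```

Quoting the constant keeps the comparison in the string domain, which is usually what is intended for a string column.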
Comparisons between floating-point numbers and large values of INTEGER type are approximate because the integer is converted to double-precision floating point before comparison, which is not capable of representing all 64-bit integers exactly. For example, the integer value 2^53 + 1 is not representable as a float, and is rounded to 2^53 or 2^53 + 2 before a float comparison, depending on the platform.
To illustrate, only the first of the following comparisons compares equal values, but both comparisons return true (1):
mysql> SELECT '9223372036854775807' = 9223372036854775807;
-> 1
mysql> SELECT '9223372036854775807' = 9223372036854775806;
-> 1
When conversions from string to floating-point and from integer to floating-point occur, they do not necessarily occur the same way. The integer may be converted to floating-point by the CPU, whereas the string is converted digit by digit in an operation that involves floating-point multiplications. Also, results can be affected by factors such as computer architecture or the compiler version or optimization level. One way to avoid such problems is to use CAST() so that a value is not converted implicitly to a floating-point number:
mysql> SELECT CAST('9223372036854775807' AS UNSIGNED) = 9223372036854775806;
-> 0
For more information about floating-point comparisons, see Section B.3.4.8, “Problems with Floating-Point Values”.
The server includes dtoa, a conversion library that provides the basis for improved conversion between string or DECIMAL values and approximate-value (FLOAT/DOUBLE) numbers:
- Consistent conversion results across platforms, which eliminates, for example, Unix versus Windows conversion differences.
- Accurate representation of values in cases where results previously did not provide sufficient precision, such as for values close to IEEE limits.
- Conversion of numbers to string format with the best possible precision. The precision of dtoa is always the same or better than that of the standard C library functions.
Because the conversions produced by this library differ in some cases from non-dtoa results, the potential exists for incompatibilities in applications that rely on previous results. For example, applications that depend on a specific exact result from previous conversions might need adjustment to accommodate additional precision.
The dtoa library provides conversions with the following properties. D represents a value with a DECIMAL or string representation, and F represents a floating-point number in native binary (IEEE) format.
- F -> D conversion is done with the best possible precision, returning D as the shortest string that yields F when read back in and rounded to the nearest value in native binary format as specified by IEEE.
- D -> F conversion is done such that F is the nearest native binary number to the input decimal string D.
These properties imply that F -> D -> F conversions are lossless unless F is -inf, +inf, or NaN. The latter values are not supported because the SQL standard defines them as invalid values for FLOAT or DOUBLE.
For D -> F -> D conversions, a sufficient condition for losslessness is that D uses 15 or fewer digits of precision, is not a denormal value, -inf, +inf, or NaN. In some cases, the conversion is lossless even if D has more than 15 digits of precision, but this is not always the case.
Implicit conversion of a numeric or temporal value to string produces a value that has a character set and collation determined by the character_set_connection and collation_connection system variables. (These variables commonly are set with SET NAMES. For information about connection character sets, see Section 10.4, “Connection Character Sets and Collations”.)
This means that such a conversion results in a character (nonbinary) string (a CHAR, VARCHAR, or LONGTEXT value), except in the case that the connection character set is set to binary. In that case, the conversion result is a binary string (a BINARY, VARBINARY, or LONGBLOB value).
For integer expressions, the preceding remarks about expression evaluation apply somewhat differently for expression assignment; for example, in a statement such as this:
CREATE TABLE t SELECT integer_expr;
In this case, the column in the table resulting from the expression has type INT or BIGINT depending on the length of the integer expression. If the maximum length of the expression does not fit in an INT, BIGINT is used instead. The length is taken from the max_length value of the SELECT result set metadata (see C API Data Structures). This means that you can force a BIGINT rather than INT by use of a sufficiently long expression:
CREATE TABLE t SELECT 000000000000000000000;
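One way to observe this rule is to create two tables from integer literals of different lengths and inspect the resulting columns. The table names t_short and t_long and the column alias c are hypothetical. According to the rule above, the short literal is expected to produce an INT column and the long one a BIGINT column, although the exact type display depends on the server version.

```sql
-- Hypothetical table names; the two literals differ only in their length.
CREATE TABLE t_short SELECT 1 AS c;
CREATE TABLE t_long  SELECT 000000000000000000000 AS c;

-- Inspect the column type chosen for each table.
SHOW COLUMNS FROM t_short;
SHOW COLUMNS FROM t_long;
```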
Table 12.3 Operators
Name | Description |
---|---|
& |
Bitwise AND |
> |
Greater than operator |
>> |
Right shift |
>= |
Greater than or equal operator |
< |
Less than operator |
<> , != |
Not equal operator |
<< |
Left shift |
<= |
Less than or equal operator |
<=> |
NULL-safe equal to operator |
% , MOD |
Modulo operator |
* |
Multiplication operator |
+ |
Addition operator |
- |
Minus operator |
- |
Change the sign of the argument |
-> |
Return value from JSON column after evaluating path; equivalent to JSON_EXTRACT(). |
->> |
Return value from JSON column after evaluating path and unquoting the result; equivalent to JSON_UNQUOTE(JSON_EXTRACT()). |
/ |
Division operator |
:= |
Assign a value |
= |
Assign a value (as part of a SET statement, or as part of the SET clause in an UPDATE statement) |
= |
Equal operator |
^ |
Bitwise XOR |
AND , && |
Logical AND |
BETWEEN ... AND ... |
Whether a value is within a range of values |
BINARY |
Cast a string to a binary string |
CASE |
Case operator |
DIV |
Integer division |
IN() |
Whether a value is within a set of values |
IS |
Test a value against a boolean |
IS NOT |
Test a value against a boolean |
IS NOT NULL |
NOT NULL value test |
IS NULL |
NULL value test |
LIKE |
Simple pattern matching |
MEMBER OF() (introduced 8.0.17) |
Returns true (1) if first operand matches any element of JSON array passed as second operand, otherwise returns false (0) |
NOT , ! |
Negates value |
NOT BETWEEN ... AND ... |
Whether a value is not within a range of values |
NOT IN() |
Whether a value is not within a set of values |
NOT LIKE |
Negation of simple pattern matching |
NOT REGEXP |
Negation of REGEXP |
OR , || |
Logical OR |
REGEXP |
Whether string matches regular expression |
RLIKE |
Whether string matches regular expression |
SOUNDS LIKE |
Compare sounds |
XOR |
Logical XOR |
| |
Bitwise OR |
~ |
Bitwise inversion |
Operator precedences are shown in the following list, from highest precedence to the lowest. Operators that are shown together on a line have the same precedence.
INTERVAL
BINARY, COLLATE
!
- (unary minus), ~ (unary bit inversion)
^
*, /, DIV, %, MOD
-, +
<<, >>
&
|
= (comparison), <=>, >=, >, <=, <, <>, !=, IS, LIKE, REGEXP, IN, MEMBER OF
BETWEEN, CASE, WHEN, THEN, ELSE
NOT
AND, &&
XOR
OR, ||
= (assignment), :=
The precedence of = depends on whether it is used as a comparison operator (=) or as an assignment operator (=). When used as a comparison operator, it has the same precedence as <=>, >=, >, <=, <, <>, !=, IS, LIKE, REGEXP, and IN(). When used as an assignment operator, it has the same precedence as :=. Section 13.7.6.1, “SET Syntax for Variable Assignment”, and Section 9.4, “User-Defined Variables”, explain how MySQL determines which interpretation of = should apply.
For operators that occur at the same precedence level within an expression, evaluation proceeds left to right, with the exception that assignments evaluate right to left.
The precedence and meaning of some operators depends on the SQL mode:
- By default, || is a logical OR operator. With PIPES_AS_CONCAT enabled, || is string concatenation, with a precedence between ^ and the unary operators.
- By default, ! has a higher precedence than NOT. With HIGH_NOT_PRECEDENCE enabled, ! and NOT have the same precedence.
See Section 5.1.11, “Server SQL Modes”.
The precedence of operators determines the order of evaluation of terms in an expression. To override this order and group terms explicitly, use parentheses. For example:
mysql> SELECT 1+2*3;
-> 7
mysql> SELECT (1+2)*3;
-> 9
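A few more statements, sketched here under the default SQL mode, show how the precedence list and the left-to-right rule apply to other operator pairs. The expected values in the comments follow directly from the list above.

```sql
-- + binds more tightly than <<, so this is (2 + 3) << 1.
SELECT 2 + 3 << 1;        -- 10

-- & binds more tightly than |, so the first form is 4 | (1 & 2).
SELECT 4 | 1 & 2;         -- 4
SELECT (4 | 1) & 2;       -- 0

-- Operators at the same level evaluate left to right: (10 - 4) - 3.
SELECT 10 - 4 - 3;        -- 3
SELECT 10 - (4 - 3);      -- 9

-- AND binds more tightly than OR, so the first form is 1 OR (0 AND 0).
SELECT 1 OR 0 AND 0;      -- 1
SELECT (1 OR 0) AND 0;    -- 0
```

Where the intended grouping is not obvious at a glance, explicit parentheses are the safer choice.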
Table 12.4 Comparison Operators
Name | Description |
---|---|
> |
Greater than operator |
>= |
Greater than or equal operator |
< |
Less than operator |
<> , != |
Not equal operator |
<= |
Less than or equal operator |
<=> |
NULL-safe equal to operator |
= |
Equal operator |
BETWEEN ... AND ... |
Whether a value is within a range of values |
COALESCE() |
Return the first non-NULL argument |
GREATEST() |
Return the largest argument |
IN() |
Whether a value is within a set of values |
INTERVAL() |
Return the index of the argument that is less than the first argument |
IS |
Test a value against a boolean |
IS NOT |
Test a value against a boolean |
IS NOT NULL |
NOT NULL value test |
IS NULL |
NULL value test |
ISNULL() |
Test whether the argument is NULL |
LEAST() |
Return the smallest argument |
LIKE |
Simple pattern matching |
NOT BETWEEN ... AND ... |
Whether a value is not within a range of values |
NOT IN() |
Whether a value is not within a set of values |
NOT LIKE |
Negation of simple pattern matching |
STRCMP() |
Compare two strings |
Comparison operations result in a value of 1 (TRUE), 0 (FALSE), or NULL. These operations work for both numbers and strings. Strings are automatically converted to numbers and numbers to strings as necessary.
The following relational comparison operators can be used to compare not only scalar operands, but row operands:
= > < >= <= <> !=
The descriptions for those operators later in this section detail how they work with row operands. For additional examples of row comparisons in the context of row subqueries, see Section 13.2.11.5, “Row Subqueries”.
Some of the functions in this section return values other than 1 (TRUE), 0 (FALSE), or NULL. LEAST() and GREATEST() are examples of such functions; Section 12.3, “Type Conversion in Expression Evaluation”, describes the rules for comparison operations performed by these and similar functions for determining their return values.
In previous versions of MySQL, when evaluating an expression containing LEAST() or GREATEST(), the server attempted to guess the context in which the function was used, and to coerce the function's arguments to the data type of the expression as a whole. For example, the arguments to LEAST("11", "45", "2") are evaluated and sorted as strings, so that this expression returns "11". In MySQL 8.0.3 and earlier, when evaluating the expression LEAST("11", "45", "2") + 0, the server converted the arguments to integers (anticipating the addition of integer 0 to the result) before sorting them, thus returning 2.
Beginning with MySQL 8.0.4, the server no longer attempts to infer context in this fashion. Instead, the function is executed using the arguments as provided, performing data type conversions to one or more of the arguments if and only if they are not all of the same type. Any type coercion mandated by an expression that makes use of the return value is now performed following function execution. This means that, in MySQL 8.0.4 and later, LEAST("11", "45", "2") + 0 evaluates to "11" + 0 and thus to integer 11. (Bug #83895, Bug #25123839)
To convert a value to a specific type for comparison purposes, you can use the CAST() function. String values can be converted to a different character set using CONVERT(). See Section 12.11, “Cast Functions and Operators”.
By default, string comparisons are not case-sensitive and use the current character set. The default is utf8mb4.
=
Equal:
mysql> SELECT 1 = 0;
-> 0
mysql> SELECT '0' = 0;
-> 1
mysql> SELECT '0.0' = 0;
-> 1
mysql> SELECT '0.01' = 0;
-> 0
mysql> SELECT '.01' = 0.01;
-> 1
For row comparisons, (a, b) = (x, y) is equivalent to:
(a = x) AND (b = y)
<=>
NULL-safe equal. This operator performs an equality comparison like the = operator, but returns 1 rather than NULL if both operands are NULL, and 0 rather than NULL if one operand is NULL.
The <=> operator is equivalent to the standard SQL IS NOT DISTINCT FROM operator.
mysql> SELECT 1 <=> 1, NULL <=> NULL, 1 <=> NULL;
-> 1, 1, 0
mysql> SELECT 1 = 1, NULL = NULL, 1 = NULL;
-> 1, NULL, NULL
For row comparisons, (a, b) <=> (x, y) is equivalent to:
(a <=> x) AND (b <=> y)
<>, !=
Not equal:
mysql> SELECT '.01' <> '0.01';
-> 1
mysql> SELECT .01 <> '0.01';
-> 0
mysql> SELECT 'zapp' <> 'zappp';
-> 1
For row comparisons, (a, b) <> (x, y) and (a, b) != (x, y) are equivalent to:
(a <> x) OR (b <> y)
<=
Less than or equal:
mysql> SELECT 0.1 <= 2;
-> 1
For row comparisons, (a, b) <= (x, y) is equivalent to:
(a < x) OR ((a = x) AND (b <= y))
<
Less than:
mysql> SELECT 2 < 2;
-> 0
For row comparisons, (a, b) < (x, y) is equivalent to:
(a < x) OR ((a = x) AND (b < y))
>=
Greater than or equal:
mysql> SELECT 2 >= 2;
-> 1
For row comparisons, (a, b) >= (x, y) is equivalent to:
(a > x) OR ((a = x) AND (b >= y))
>
Greater than:
mysql> SELECT 2 > 2;
-> 0
For row comparisons, (a, b) > (x, y) is equivalent to:
(a > x) OR ((a = x) AND (b > y))
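The row-comparison equivalences given for these operators can be checked with literal row constructors. This is a small sketch; the expected results in the comments are derived by expanding each comparison according to the equivalences stated in this section.

```sql
SELECT (1, 2) = (1, 2);          -- 1: (1 = 1) AND (2 = 2)
SELECT (1, 2) <> (1, 3);         -- 1: (1 <> 1) OR (2 <> 3)
SELECT (1, 2) < (1, 3);          -- 1: first columns are equal, and 2 < 3
SELECT (2, 0) > (1, 9);          -- 1: decided by the first column alone
SELECT (1, 2) <= (1, 2);         -- 1: (1 = 1) AND (2 <= 2)

-- NULL handling differs between = and <=>.
SELECT (1, NULL) = (1, NULL);    -- NULL: (1 = 1) AND (NULL = NULL)
SELECT (1, NULL) <=> (1, NULL);  -- 1: (1 <=> 1) AND (NULL <=> NULL)
```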
expr BETWEEN min AND max
If expr is greater than or equal to min and expr is less than or equal to max, BETWEEN returns 1, otherwise it returns 0. This is equivalent to the expression (min <= expr AND expr <= max) if all the arguments are of the same type. Otherwise type conversion takes place according to the rules described in Section 12.3, “Type Conversion in Expression Evaluation”, but applied to all the three arguments.
mysql> SELECT 2 BETWEEN 1 AND 3, 2 BETWEEN 3 and 1;
-> 1, 0
mysql> SELECT 1 BETWEEN 2 AND 3;
-> 0
mysql> SELECT 'b' BETWEEN 'a' AND 'c';
-> 1
mysql> SELECT 2 BETWEEN 2 AND '3';
-> 1
mysql> SELECT 2 BETWEEN 2 AND 'x-3';
-> 0
For best results when using BETWEEN with date or time values, use CAST() to explicitly convert the values to the desired data type. Examples: If you compare a DATETIME to two DATE values, convert the DATE values to DATETIME values. If you use a string constant such as '2001-1-1' in a comparison to a DATE, cast the string to a DATE.
expr NOT BETWEEN min AND max
This is the same as NOT (expr BETWEEN min AND max).
COALESCE(value,...)
Returns the first non-NULL value in the list, or NULL if there are no non-NULL values.
The return type of COALESCE() is the aggregated type of the argument types.
mysql> SELECT COALESCE(NULL,1);
-> 1
mysql> SELECT COALESCE(NULL,NULL,NULL);
-> NULL
GREATEST(value1,value2,...)
With two or more arguments, returns the largest (maximum-valued) argument. The arguments are compared using the same rules as for LEAST().
mysql> SELECT GREATEST(2,0);
-> 2
mysql> SELECT GREATEST(34.0,3.0,5.0,767.0);
-> 767.0
mysql> SELECT GREATEST('B','A','C');
-> 'C'
GREATEST() returns NULL if any argument is NULL.
expr IN (value,...)
Returns 1 (true) if expr is equal to any of the values in the IN() list, else returns 0 (false).
Type conversion takes place according to the rules described in Section 12.3, “Type Conversion in Expression Evaluation”, applied to all the arguments. If no type conversion is needed for the values in the IN() list, they are all non-JSON constants of the same type, and expr can be compared to each of them as a value of the same type (possibly after type conversion), an optimization takes place. The values in the list are sorted and the search for expr is done using a binary search, which makes the IN() operation very quick.
mysql> SELECT 2 IN (0,3,5,7);
-> 0
mysql> SELECT 'wefwf' IN ('wee','wefwf','weg');
-> 1
IN() can be used to compare row constructors:
mysql> SELECT (3,4) IN ((1,2), (3,4));
-> 1
mysql> SELECT (3,4) IN ((1,2), (3,5));
-> 0
You should never mix quoted and unquoted values in an IN() list because the comparison rules for quoted values (such as strings) and unquoted values (such as numbers) differ. Mixing types may therefore lead to inconsistent results. For example, do not write an IN() expression like this:
SELECT val1 FROM tbl1 WHERE val1 IN (1,2,'a');
Instead, write it like this:
SELECT val1 FROM tbl1 WHERE val1 IN ('1','2','a');
Implicit type conversion may produce nonintuitive results:
mysql> SELECT 'a' IN (0), 0 IN ('b');
-> 1, 1
In both cases, the comparison values are converted to floating-point values, yielding 0.0 in each case, and a comparison result of 1 (true).
The number of values in the IN() list is only limited by the max_allowed_packet value.
To comply with the SQL standard, IN() returns NULL not only if the expression on the left hand side is NULL, but also if no match is found in the list and one of the expressions in the list is NULL.
IN() syntax can also be used to write certain types of subqueries. See Section 13.2.11.3, “Subqueries with ANY, IN, or SOME”.
expr NOT IN (value,...)
This is the same as NOT (expr IN (value,...)).
INTERVAL(N,N1,N2,N3,...)
Returns 0 if N < N1, 1 if N < N2 and so on, or -1 if N is NULL. All arguments are treated as integers. It is required that N1 < N2 < N3 < ... < Nn for this function to work correctly. This is because a binary search is used (very fast).
mysql> SELECT INTERVAL(23, 1, 15, 17, 30, 44, 200);
-> 3
mysql> SELECT INTERVAL(10, 1, 10, 100, 1000);
-> 2
mysql> SELECT INTERVAL(22, 23, 30, 44, 200);
-> 0
IS boolean_value
Tests a value against a boolean value, where boolean_value can be TRUE, FALSE, or UNKNOWN.
mysql> SELECT 1 IS TRUE, 0 IS FALSE, NULL IS UNKNOWN;
-> 1, 1, 1
IS NOT boolean_value
Tests a value against a boolean value, where boolean_value can be TRUE, FALSE, or UNKNOWN.
mysql> SELECT 1 IS NOT UNKNOWN, 0 IS NOT UNKNOWN, NULL IS NOT UNKNOWN;
-> 1, 1, 0
IS NULL
Tests whether a value is NULL.
mysql> SELECT 1 IS NULL, 0 IS NULL, NULL IS NULL;
-> 0, 0, 1
To work well with ODBC programs, MySQL supports the following extra features when using IS NULL:
- If the sql_auto_is_null variable is set to 1, then after a statement that successfully inserts an automatically generated AUTO_INCREMENT value, you can find that value by issuing a statement of the following form:
  SELECT * FROM tbl_name WHERE auto_col IS NULL
  If the statement returns a row, the value returned is the same as if you invoked the LAST_INSERT_ID() function. For details, including the return value after a multiple-row insert, see Section 12.16, “Information Functions”. If no AUTO_INCREMENT value was successfully inserted, the SELECT statement returns no row.
  The behavior of retrieving an AUTO_INCREMENT value by using an IS NULL comparison can be disabled by setting sql_auto_is_null = 0. See Section 5.1.8, “Server System Variables”.
  The default value of sql_auto_is_null is 0.
- For DATE and DATETIME columns that are declared as NOT NULL, you can find the special date '0000-00-00' by using a statement like this:
  SELECT * FROM tbl_name WHERE date_column IS NULL
  This is needed to get some ODBC applications to work because ODBC does not support a '0000-00-00' date value.
  See Obtaining Auto-Increment Values, and the description for the FLAG_AUTO_IS_NULL option at Connector/ODBC Connection Parameters.
IS NOT NULL
Tests whether a value is not NULL.
mysql> SELECT 1 IS NOT NULL, 0 IS NOT NULL, NULL IS NOT NULL;
-> 1, 1, 0
ISNULL(expr)
If expr is NULL, ISNULL() returns 1, otherwise it returns 0.
mysql> SELECT ISNULL(1+1);
-> 0
mysql> SELECT ISNULL(1/0);
-> 1
ISNULL() can be used instead of = to test whether a value is NULL. (Comparing a value to NULL using = always yields NULL.)
The ISNULL() function shares some special behaviors with the IS NULL comparison operator. See the description of IS NULL.
LEAST(value1,value2,...)
With two or more arguments, returns the smallest (minimum-valued) argument. The arguments are compared using the following rules:
- If any argument is NULL, the result is NULL. No comparison is needed.
- If all arguments are integer-valued, they are compared as integers.
- If at least one argument is double precision, they are compared as double-precision values. Otherwise, if at least one argument is a DECIMAL value, they are compared as DECIMAL values.
- If the arguments comprise a mix of numbers and strings, they are compared as strings.
- If any argument is a nonbinary (character) string, the arguments are compared as nonbinary strings.
- In all other cases, the arguments are compared as binary strings.
The return type of LEAST() is the aggregated type of the comparison argument types.
mysql> SELECT LEAST(2,0);
-> 0
mysql> SELECT LEAST(34.0,3.0,5.0,767.0);
-> 3.0
mysql> SELECT LEAST('B','A','C');
-> 'A'
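The NULL-handling rules stated earlier in this section for IN(), COALESCE(), GREATEST(), LEAST(), and INTERVAL() are easy to confuse, so the following sketch collects them in one place. The expected results in the comments are taken from the descriptions above.

```sql
SELECT 1 IN (1, NULL);            -- 1: a match is found
SELECT 2 IN (1, NULL);            -- NULL: no match, and the list contains NULL
SELECT NULL IN (1, 2);            -- NULL: the left hand side is NULL

SELECT COALESCE(NULL, NULL, 3);   -- 3: first non-NULL value
SELECT GREATEST(1, NULL, 3);      -- NULL: any NULL argument yields NULL
SELECT LEAST(1, NULL, 3);         -- NULL: any NULL argument yields NULL

SELECT INTERVAL(NULL, 1, 10);     -- -1: N is NULL

-- NOT BETWEEN is the negation of BETWEEN.
SELECT 5 BETWEEN 1 AND 10, 5 NOT BETWEEN 1 AND 10;   -- 1, 0
```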
In SQL, all logical operators evaluate to TRUE, FALSE, or NULL (UNKNOWN). In MySQL, these are implemented as 1 (TRUE), 0 (FALSE), and NULL. Most of this is common to different SQL database servers, although some servers may return any nonzero value for TRUE.
MySQL evaluates any nonzero, non-NULL value to TRUE. For example, the following statements all evaluate to TRUE:
mysql> SELECT 10 IS TRUE;
-> 1
mysql> SELECT -10 IS TRUE;
-> 1
mysql> SELECT 'string' IS NOT NULL;
-> 1
NOT, !
Logical NOT. Evaluates to 1 if the operand is 0, to 0 if the operand is nonzero, and NOT NULL returns NULL.
mysql> SELECT NOT 10;
-> 0
mysql> SELECT NOT 0;
-> 1
mysql> SELECT NOT NULL;
-> NULL
mysql> SELECT ! (1+1);
-> 0
mysql> SELECT ! 1+1;
-> 1
The last example produces 1 because the expression evaluates the same way as (!1)+1.
The ! operator is a nonstandard MySQL extension. As of MySQL 8.0.17, this operator is deprecated; expect it to be removed in a future version of MySQL. Applications should be adjusted to use the standard SQL NOT operator.
AND, &&
Logical AND. Evaluates to 1 if all operands are nonzero and not NULL, to 0 if one or more operands are 0, otherwise NULL is returned.
mysql> SELECT 1 AND 1;
-> 1
mysql> SELECT 1 AND 0;
-> 0
mysql> SELECT 1 AND NULL;
-> NULL
mysql> SELECT 0 AND NULL;
-> 0
mysql> SELECT NULL AND 0;
-> 0
The && operator is a nonstandard MySQL extension. As of MySQL 8.0.17, this operator is deprecated; expect support for it to be removed in a future version of MySQL. Applications should be adjusted to use the standard SQL AND operator.
OR, ||
Logical OR. When both operands are non-NULL, the result is 1 if any operand is nonzero, and 0 otherwise. With a NULL operand, the result is 1 if the other operand is nonzero, and NULL otherwise. If both operands are NULL, the result is NULL.
mysql> SELECT 1 OR 1;
-> 1
mysql> SELECT 1 OR 0;
-> 1
mysql> SELECT 0 OR 0;
-> 0
mysql> SELECT 0 OR NULL;
-> NULL
mysql> SELECT 1 OR NULL;
-> 1
Note
If the PIPES_AS_CONCAT SQL mode is enabled, || signifies the SQL-standard string concatenation operator (like CONCAT()).
The || operator is a nonstandard MySQL extension. As of MySQL 8.0.17, this operator is deprecated; expect support for it to be removed in a future version of MySQL. Applications should be adjusted to use the standard SQL OR operator. Exception: Deprecation does not apply if PIPES_AS_CONCAT is enabled because, in that case, || signifies string concatenation.
XOR
Logical XOR. Returns NULL if either operand is NULL. For non-NULL operands, evaluates to 1 if an odd number of operands is nonzero, otherwise 0 is returned.
mysql> SELECT 1 XOR 1;
-> 0
mysql> SELECT 1 XOR 0;
-> 1
mysql> SELECT 1 XOR NULL;
-> NULL
mysql> SELECT 1 XOR 1 XOR 1;
-> 1
a XOR b is mathematically equal to (a AND (NOT b)) OR ((NOT a) and b).
:=
Assignment operator. Causes the user variable on the left hand side of the operator to take on the value to its right. The value on the right hand side may be a literal value, another variable storing a value, or any legal expression that yields a scalar value, including the result of a query (provided that this value is a scalar value). You can perform multiple assignments in the same SET statement.
Unlike =, the := operator is never interpreted as a comparison operator. This means you can use := in any valid SQL statement (not just in SET statements) to assign a value to a variable.
mysql> SELECT @var1, @var2;
-> NULL, NULL
mysql> SELECT @var1 := 1, @var2;
-> 1, NULL
mysql> SELECT @var1, @var2;
-> 1, NULL
mysql> SELECT @var1, @var2 := @var1;
-> 1, 1
mysql> SELECT @var1, @var2;
-> 1, 1
mysql> SELECT @var1:=COUNT(*) FROM t1;
-> 4
mysql> SELECT @var1;
-> 4
You can make value assignments using := in other statements besides SELECT, such as UPDATE, as shown here:
mysql> SELECT @var1;
-> 4
mysql> SELECT * FROM t1;
-> 1, 3, 5, 7
mysql> UPDATE t1 SET c1 = 2 WHERE c1 = @var1:= 1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> SELECT @var1;
-> 1
mysql> SELECT * FROM t1;
-> 2, 3, 5, 7
While it is also possible both to set and to read the value of the same variable in a single SQL statement using the := operator, this is not recommended. Section 9.4, “User-Defined Variables”, explains why you should avoid doing this.
=
This operator is used to perform value assignments in two cases, described in the next two paragraphs.
Within a SET statement, = is treated as an assignment operator that causes the user variable on the left hand side of the operator to take on the value to its right. (In other words, when used in a SET statement, = is treated identically to :=.) The value on the right hand side may be a literal value, another variable storing a value, or any legal expression that yields a scalar value, including the result of a query (provided that this value is a scalar value). You can perform multiple assignments in the same SET statement.
In the SET clause of an UPDATE statement, = also acts as an assignment operator; in this case, however, it causes the column named on the left hand side of the operator to assume the value given to the right, provided any WHERE conditions that are part of the UPDATE are met. You can make multiple assignments in the same SET clause of an UPDATE statement.
In any other context, = is treated as a comparison operator.
mysql> SELECT @var1, @var2;
-> NULL, NULL
mysql> SELECT @var1 := 1, @var2;
-> 1, NULL
mysql> SELECT @var1, @var2;
-> 1, NULL
mysql> SELECT @var1, @var2 := @var1;
-> 1, 1
mysql> SELECT @var1, @var2;
-> 1, 1
For more information, see Section 13.7.6.1, “SET Syntax for Variable Assignment”, Section 13.2.13, “UPDATE Statement”, and Section 13.2.11, “Subqueries”.
CASE value WHEN compare_value THEN result [WHEN compare_value THEN result ...] [ELSE result] END
CASE WHEN condition THEN result [WHEN condition THEN result ...] [ELSE result] END
The first CASE syntax returns the result for the first value=compare_value comparison that is true. The second syntax returns the result for the first condition that is true. If no comparison or condition is true, the result after ELSE is returned, or NULL if there is no ELSE part.
Note
The syntax of the CASE operator described here differs slightly from that of the SQL CASE statement described in Section 13.6.5.1, “CASE Statement”, for use inside stored programs. The CASE statement cannot have an ELSE NULL clause, and it is terminated with END CASE instead of END.
The return type of a
CASE expression result is the aggregated type of all result values:
- If all types are numeric, the aggregated type is also numeric:
  - If at least one argument is double precision, the result is double precision.
  - Otherwise, if at least one argument is DECIMAL, the result is DECIMAL.
  - Otherwise, the result is an integer type (with one exception):
    - If the integer types are all signed or all unsigned, the result has the same sign and the precision is the highest of all specified integer types (that is, TINYINT, SMALLINT, MEDIUMINT, INT, or BIGINT).
    - If there is a combination of signed and unsigned integer types, the result is signed and the precision may be higher. For example, if the types are signed INT and unsigned INT, the result is signed BIGINT.
    - The exception is unsigned BIGINT combined with any signed integer type. The result is DECIMAL with sufficient precision and scale 0.
- If all types are BIT, the result is BIT. Otherwise, BIT arguments are treated similarly to BIGINT.
- If all types are YEAR, the result is YEAR. Otherwise, YEAR arguments are treated similarly to INT.
- If all types are character string (CHAR or VARCHAR), the result is VARCHAR with maximum length determined by the longest character length of the operands.
- If all types are character or binary string, the result is VARBINARY.
- SET and ENUM are treated similarly to VARCHAR; the result is VARCHAR.
- If all types are temporal, the result is temporal.
- If all types are GEOMETRY, the result is GEOMETRY.
- For all other type combinations, the result is VARCHAR.
- Literal NULL operands are ignored for type aggregation.
mysql> SELECT CASE 1 WHEN 1 THEN 'one'
    ->     WHEN 2 THEN 'two' ELSE 'more' END;
        -> 'one'
mysql> SELECT CASE WHEN 1>0 THEN 'true' ELSE 'false' END;
        -> 'true'
mysql> SELECT CASE BINARY 'B'
    ->     WHEN 'a' THEN 1 WHEN 'b' THEN 2 END;
        -> NULL
IF(expr1,expr2,expr3)
If expr1 is TRUE (expr1 <> 0 and expr1 IS NOT NULL), IF() returns expr2. Otherwise, it returns expr3.
Note
There is also an IF statement, which differs from the IF() function described here. See Section 13.6.5.2, “IF Statement”.
If only one of expr2 or expr3 is explicitly NULL, the result type of the IF() function is the type of the non-NULL expression.
The default return type of IF() (which may matter when it is stored into a temporary table) is calculated as follows:
- If expr2 or expr3 produces a string, the result is a string.
  If expr2 and expr3 are both strings, the result is case-sensitive if either string is case-sensitive.
- If expr2 or expr3 produces a floating-point value, the result is a floating-point value.
- If expr2 or expr3 produces an integer, the result is an integer.
mysql> SELECT IF(1>2,2,3);
        -> 3
mysql> SELECT IF(1<2,'yes','no');
        -> 'yes'
mysql> SELECT IF(STRCMP('test','test1'),'no','yes');
        -> 'no'
IFNULL(expr1,expr2)
If expr1 is not NULL, IFNULL() returns expr1; otherwise it returns expr2.
mysql> SELECT IFNULL(1,0);
        -> 1
mysql> SELECT IFNULL(NULL,10);
        -> 10
mysql> SELECT IFNULL(1/0,10);
        -> 10
mysql> SELECT IFNULL(1/0,'yes');
        -> 'yes'
The default return type of IFNULL(expr1,expr2) is the more “general” of the two expressions, in the order STRING, REAL, or INTEGER. Consider the case of a table based on expressions or where MySQL must internally store a value returned by IFNULL() in a temporary table:
mysql> CREATE TABLE tmp SELECT IFNULL(1,'test') AS test;
mysql> DESCRIBE tmp;
+-------+--------------+------+-----+---------+-------+
| Field | Type         | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| test  | varbinary(4) | NO   |     |         |       |
+-------+--------------+------+-----+---------+-------+
In this example, the type of the test column is VARBINARY(4) (a string type).
NULLIF(expr1,expr2)
Returns NULL if expr1 = expr2 is true, otherwise returns expr1. This is the same as CASE WHEN expr1 = expr2 THEN NULL ELSE expr1 END.
The return value has the same type as the first argument.
mysql> SELECT NULLIF(1,1);
        -> NULL
mysql> SELECT NULLIF(1,2);
        -> 1
Note
MySQL evaluates expr1 twice if the arguments are not equal.
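The flow control functions described above are often combined. The following is a minimal sketch, with literal values and the 'n/a' placeholder chosen purely for illustration, that uses NULLIF() to turn an empty string into NULL and IFNULL() to substitute a default:

```sql
-- NULLIF() maps the empty string to NULL; IFNULL() then supplies a default.
SELECT IFNULL(NULLIF('', ''), 'n/a');                   -- 'n/a'
-- A nonempty value passes through unchanged.
SELECT IFNULL(NULLIF('abc', ''), 'n/a');                -- 'abc'
-- The same logic written as a searched CASE expression.
SELECT CASE WHEN 'abc' = '' THEN 'n/a' ELSE 'abc' END;  -- 'abc'
```

The CASE form is longer but makes the comparison explicit, which can be easier to read when more than one condition is involved.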
Table 12.8 Numeric Functions and Operators
Name | Description |
---|---|
%, MOD | Modulo operator |
* | Multiplication operator |
+ | Addition operator |
- | Minus operator |
- | Change the sign of the argument |
/ | Division operator |
ABS() | Return the absolute value |
ACOS() | Return the arc cosine |
ASIN() | Return the arc sine |
ATAN() | Return the arc tangent |
ATAN2(), ATAN() | Return the arc tangent of the two arguments |
CEIL() | Return the smallest integer value not less than the argument |
CEILING() | Return the smallest integer value not less than the argument |
CONV() | Convert numbers between different number bases |
COS() | Return the cosine |
COT() | Return the cotangent |
CRC32() | Compute a cyclic redundancy check value |
DEGREES() | Convert radians to degrees |
DIV | Integer division |
EXP() | Raise to the power of |
FLOOR() | Return the largest integer value not greater than the argument |
LN() | Return the natural logarithm of the argument |
LOG() | Return the natural logarithm of the first argument |
LOG10() | Return the base-10 logarithm of the argument |
LOG2() | Return the base-2 logarithm of the argument |
MOD() | Return the remainder |
PI() | Return the value of pi |
POW() | Return the argument raised to the specified power |
POWER() | Return the argument raised to the specified power |
RADIANS() | Return argument converted to radians |
RAND() | Return a random floating-point value |
ROUND() | Round the argument |
SIGN() | Return the sign of the argument |
SIN() | Return the sine of the argument |
SQRT() | Return the square root of the argument |
TAN() | Return the tangent of the argument |
TRUNCATE() | Truncate to specified number of decimal places |
The usual arithmetic operators are available. The result is determined according to the following rules:
- In the case of -, +, and *, the result is calculated with BIGINT (64-bit) precision if both operands are integers.
- If both operands are integers and any of them are unsigned, the result is an unsigned integer. For subtraction, if the NO_UNSIGNED_SUBTRACTION SQL mode is enabled, the result is signed even if any operand is unsigned.
- If any of the operands of a +, -, /, *, % is a real or string value, the precision of the result is the precision of the operand with the maximum precision.
- In division performed with /, the scale of the result when using two exact-value operands is the scale of the first operand plus the value of the div_precision_increment system variable (which is 4 by default). For example, the result of the expression 5.05 / 0.014 has a scale of six decimal places (360.714286).
These rules are applied for each operation, such that nested calculations imply the precision of each component. Hence, (14620 / 9432456) / (24250 / 9432456) resolves first to (0.0014) / (0.0026), with the final result having 8 decimal places (0.60288653).
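The effect of div_precision_increment on the scale of a division result can be observed directly. The sketch below assumes the default value of 4 at the start of the session and changes only the session value:

```sql
-- Default div_precision_increment is 4: the scale of the result is the
-- scale of the first operand (0) plus 4.
SELECT 1/7;                          -- 0.1429
-- Raising the session value increases the scale of later divisions.
SET div_precision_increment = 12;
SELECT 1/7;                          -- 0.142857142857
-- Reset the session value to the global value.
SET div_precision_increment = DEFAULT;
```

Because the scale is fixed at each division, intermediate results in a nested calculation are rounded before the outer operation is applied.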
Because of these rules and the way they are applied, care should be taken to ensure that components and subcomponents of a calculation use the appropriate level of precision. See Section 12.11, “Cast Functions and Operators”.
For information about handling of overflow in numeric expression evaluation, see Section 11.1.7, “Out-of-Range and Overflow Handling”.
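As an illustration of the unsigned-subtraction rule and of out-of-range handling, the following sketch subtracts from an unsigned zero with and without the NO_UNSIGNED_SUBTRACTION SQL mode. Note that assigning sql_mode this way replaces all modes for the session, so it is suitable only for experimentation:

```sql
-- Without NO_UNSIGNED_SUBTRACTION the result type is unsigned,
-- so a negative result is out of range.
SET sql_mode = '';
SELECT CAST(0 AS UNSIGNED) - 1;   -- out-of-range error
-- With the mode enabled, the result of the subtraction is signed.
SET sql_mode = 'NO_UNSIGNED_SUBTRACTION';
SELECT CAST(0 AS UNSIGNED) - 1;   -- -1
```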
Arithmetic operators apply to numbers. For other types of values, alternative operations may be available. For example, to add date values, use DATE_ADD(); see Section 12.7, “Date and Time Functions”.
Addition:
mysql> SELECT 3+5;
        -> 8
Subtraction:
mysql> SELECT 3-5;
        -> -2
Unary minus. This operator changes the sign of the operand.
mysql> SELECT - 2;
        -> -2
Multiplication:
mysql> SELECT 3*5;
        -> 15
mysql> SELECT 18014398509481984*18014398509481984.0;
        -> 324518553658426726783156020576256.0
mysql> SELECT 18014398509481984*18014398509481984;
        -> out-of-range error
The last expression produces an error because the result of the integer multiplication exceeds the 64-bit range of BIGINT calculations. (See Section 11.1, “Numeric Data Types”.)
Division:
mysql> SELECT 3/5;
        -> 0.60
Division by zero produces a NULL result:
mysql> SELECT 102/(1-1);
        -> NULL
A division is calculated with BIGINT arithmetic only if performed in a context where its result is converted to an integer.
Integer division. Discards from the division result any fractional part to the right of the decimal point.
If either operand has a noninteger type, the operands are converted to DECIMAL and divided using DECIMAL arithmetic before converting the result to BIGINT. If the result exceeds BIGINT range, an error occurs.
mysql> SELECT 5 DIV 2, -5 DIV 2, 5 DIV -2, -5 DIV -2;
        -> 2, -2, -2, 2
Modulo operation. Returns the remainder of N divided by M. For more information, see the description for the MOD() function in Section 12.6.2, “Mathematical Functions”.
Table 12.10 Mathematical Functions
Name | Description |
---|---|
ABS() | Return the absolute value |
ACOS() | Return the arc cosine |
ASIN() | Return the arc sine |
ATAN() | Return the arc tangent |
ATAN2(), ATAN() | Return the arc tangent of the two arguments |
CEIL() | Return the smallest integer value not less than the argument |
CEILING() | Return the smallest integer value not less than the argument |
CONV() | Convert numbers between different number bases |
COS() | Return the cosine |
COT() | Return the cotangent |
CRC32() | Compute a cyclic redundancy check value |
DEGREES() | Convert radians to degrees |
EXP() | Raise to the power of |
FLOOR() | Return the largest integer value not greater than the argument |
LN() | Return the natural logarithm of the argument |
LOG() | Return the natural logarithm of the first argument |
LOG10() | Return the base-10 logarithm of the argument |
LOG2() | Return the base-2 logarithm of the argument |
MOD() | Return the remainder |
PI() | Return the value of pi |
POW() | Return the argument raised to the specified power |
POWER() | Return the argument raised to the specified power |
RADIANS() | Return argument converted to radians |
RAND() | Return a random floating-point value |
ROUND() | Round the argument |
SIGN() | Return the sign of the argument |
SIN() | Return the sine of the argument |
SQRT() | Return the square root of the argument |
TAN() | Return the tangent of the argument |
TRUNCATE() | Truncate to specified number of decimal places |
All mathematical functions return NULL in the event of an error.
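Because an invalid argument yields NULL rather than stopping the statement, callers that need a definite value usually have to handle NULL themselves. A brief sketch, reusing argument values that appear in the function descriptions in this section (the fallback value 0 is an arbitrary choice for illustration):

```sql
-- Each of these arguments is outside the function's domain.
SELECT SQRT(-16);             -- NULL
SELECT ACOS(1.0001);          -- NULL
SELECT LN(-2);                -- NULL
-- IFNULL() can substitute a fallback when the result must not be NULL.
SELECT IFNULL(SQRT(-16), 0);  -- 0
```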
ABS(X)
Returns the absolute value of X, or NULL if X is NULL.
The result type is derived from the argument type. An implication of this is that ABS(-9223372036854775808) produces an error because the result cannot be stored in a signed BIGINT value.
mysql> SELECT ABS(2);
        -> 2
mysql> SELECT ABS(-32);
        -> 32
This function is safe to use with BIGINT values.
ACOS(X)
Returns the arc cosine of X, that is, the value whose cosine is X. Returns NULL if X is not in the range -1 to 1.
mysql> SELECT ACOS(1);
        -> 0
mysql> SELECT ACOS(1.0001);
        -> NULL
mysql> SELECT ACOS(0);
        -> 1.5707963267949
ASIN(X)
Returns the arc sine of X, that is, the value whose sine is X. Returns NULL if X is not in the range -1 to 1.
mysql> SELECT ASIN(0.2);
        -> 0.20135792079033
mysql> SELECT ASIN('foo');
+-------------+
| ASIN('foo') |
+-------------+
|           0 |
+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> SHOW WARNINGS;
+---------+------+-----------------------------------------+
| Level   | Code | Message                                 |
+---------+------+-----------------------------------------+
| Warning | 1292 | Truncated incorrect DOUBLE value: 'foo' |
+---------+------+-----------------------------------------+
ATAN(X)
Returns the arc tangent of X, that is, the value whose tangent is X.
mysql> SELECT ATAN(2);
        -> 1.1071487177941
mysql> SELECT ATAN(-2);
        -> -1.1071487177941
ATAN(Y,X), ATAN2(Y,X)
Returns the arc tangent of the two variables X and Y. It is similar to calculating the arc tangent of Y / X, except that the signs of both arguments are used to determine the quadrant of the result.
mysql> SELECT ATAN(-2,2);
        -> -0.78539816339745
mysql> SELECT ATAN2(PI(),0);
        -> 1.5707963267949
CEILING(X)
Returns the smallest integer value not less than X.
mysql> SELECT CEILING(1.23);
        -> 2
mysql> SELECT CEILING(-1.23);
        -> -1
For exact-value numeric arguments, the return value has an exact-value numeric type. For string or floating-point arguments, the return value has a floating-point type.
CONV(N,from_base,to_base)
Converts numbers between different number bases. Returns a string representation of the number N, converted from base from_base to base to_base. Returns NULL if any argument is NULL. The argument N is interpreted as an integer, but may be specified as an integer or a string. The minimum base is 2 and the maximum base is 36. If from_base is a negative number, N is regarded as a signed number. Otherwise, N is treated as unsigned. CONV() works with 64-bit precision.
mysql> SELECT CONV('a',16,2);
        -> '1010'
mysql> SELECT CONV('6E',18,8);
        -> '172'
mysql> SELECT CONV(-17,10,-18);
        -> '-H'
mysql> SELECT CONV(10+'10'+'10'+X'0a',10,10);
        -> '40'
COS(X)
Returns the cosine of X, where X is given in radians.
mysql> SELECT COS(PI());
        -> -1
COT(X)
Returns the cotangent of X.
mysql> SELECT COT(12);
        -> -1.5726734063977
mysql> SELECT COT(0);
        -> out-of-range error
CRC32(expr)
Computes a cyclic redundancy check value and returns a 32-bit unsigned value. The result is NULL if the argument is NULL. The argument is expected to be a string and (if possible) is treated as one if it is not.
mysql> SELECT CRC32('MySQL');
        -> 3259397556
mysql> SELECT CRC32('mysql');
        -> 2501908538
DEGREES(X)
Returns the argument X, converted from radians to degrees.
mysql> SELECT DEGREES(PI());
        -> 180
mysql> SELECT DEGREES(PI() / 2);
        -> 90
EXP(X)
Returns the value of e (the base of natural logarithms) raised to the power of X. The inverse of this function is LOG() (using a single argument only) or LN().
mysql> SELECT EXP(2);
        -> 7.3890560989307
mysql> SELECT EXP(-2);
        -> 0.13533528323661
mysql> SELECT EXP(0);
        -> 1
FLOOR(X)
Returns the largest integer value not greater than X.
mysql> SELECT FLOOR(1.23), FLOOR(-1.23);
        -> 1, -2
For exact-value numeric arguments, the return value has an exact-value numeric type. For string or floating-point arguments, the return value has a floating-point type.
FORMAT(X,D)
Formats the number X to a format like '#,###,###.##', rounded to D decimal places, and returns the result as a string. For details, see Section 12.8, “String Functions and Operators”.
HEX(N_or_S)
This function can be used to obtain a hexadecimal representation of a decimal number or a string; the manner in which it does so varies according to the argument's type. See this function's description in Section 12.8, “String Functions and Operators”, for details.
LN(X)
Returns the natural logarithm of X; that is, the base-e logarithm of X. If X is less than or equal to 0.0E0, the function returns NULL and a warning “Invalid argument for logarithm” is reported.
mysql> SELECT LN(2);
        -> 0.69314718055995
mysql> SELECT LN(-2);
        -> NULL
This function is synonymous with LOG(X). The inverse of this function is the EXP() function.
LOG(X), LOG(B,X)
If called with one parameter, this function returns the natural logarithm of X. If X is less than or equal to 0.0E0, the function returns NULL and a warning “Invalid argument for logarithm” is reported.
The inverse of this function (when called with a single argument) is the EXP() function.
mysql> SELECT LOG(2);
        -> 0.69314718055995
mysql> SELECT LOG(-2);
        -> NULL
If called with two parameters, this function returns the logarithm of X to the base B. If X is less than or equal to 0, or if B is less than or equal to 1, then NULL is returned.
mysql> SELECT LOG(2,65536);
        -> 16
mysql> SELECT LOG(10,100);
        -> 2
mysql> SELECT LOG(1,100);
        -> NULL
LOG(B,X) is equivalent to LOG(X) / LOG(B).
LOG2(X)
Returns the base-2 logarithm of X. If X is less than or equal to 0.0E0, the function returns NULL and a warning “Invalid argument for logarithm” is reported.
mysql> SELECT LOG2(65536);
        -> 16
mysql> SELECT LOG2(-100);
        -> NULL
LOG2() is useful for finding out how many bits a number requires for storage. This function is equivalent to the expression LOG(X) / LOG(2).
LOG10(X)
Returns the base-10 logarithm of X. If X is less than or equal to 0.0E0, the function returns NULL and a warning “Invalid argument for logarithm” is reported.
mysql> SELECT LOG10(2);
        -> 0.30102999566398
mysql> SELECT LOG10(100);
        -> 2
mysql> SELECT LOG10(-100);
        -> NULL
MOD(N,M), N % M, N MOD M
Modulo operation. Returns the remainder of N divided by M.
mysql> SELECT MOD(234, 10);
        -> 4
mysql> SELECT 253 % 7;
        -> 1
mysql> SELECT MOD(29,9);
        -> 2
mysql> SELECT 29 MOD 9;
        -> 2
This function is safe to use with BIGINT values.
MOD() also works on values that have a fractional part and returns the exact remainder after division:
mysql> SELECT MOD(34.5,3);
        -> 1.5
MOD(N,0) returns NULL.
PI()
Returns the value of π (pi). The default number of decimal places displayed is seven, but MySQL uses the full double-precision value internally.
mysql> SELECT PI();
        -> 3.141593
mysql> SELECT PI()+0.000000000000000000;
        -> 3.141592653589793116
POW(X,Y)
Returns the value of X raised to the power of Y.
mysql> SELECT POW(2,2);
        -> 4
mysql> SELECT POW(2,-2);
        -> 0.25
POWER(X,Y)
This is a synonym for POW().
RADIANS(X)
Returns the argument X, converted from degrees to radians. (Note that π radians equals 180 degrees.)
mysql> SELECT RADIANS(90);
        -> 1.5707963267949
RAND([N])
Returns a random floating-point value v in the range 0 <= v < 1.0. To obtain a random integer R in the range i <= R < j, use the expression FLOOR(i + RAND() * (j - i)). For example, to obtain a random integer in the range 7 <= R < 12, use the following statement:
SELECT FLOOR(7 + (RAND() * 5));
If an integer argument N is specified, it is used as the seed value:
- With a constant initializer argument, the seed is initialized once when the statement is prepared, prior to execution.
- With a nonconstant initializer argument (such as a column name), the seed is initialized with the value for each invocation of RAND().
One implication of this behavior is that for equal argument values, RAND(N) returns the same value each time, and thus produces a repeatable sequence of column values. In the following example, the sequence of values produced by RAND(3) is the same in both places it occurs.
mysql> CREATE TABLE t (i INT);
Query OK, 0 rows affected (0.42 sec)
mysql> INSERT INTO t VALUES(1),(2),(3);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0
mysql> SELECT i, RAND() FROM t;
+------+------------------+
| i    | RAND()           |
+------+------------------+
|    1 | 0.61914388706828 |
|    2 | 0.93845168309142 |
|    3 | 0.83482678498591 |
+------+------------------+
3 rows in set (0.00 sec)
mysql> SELECT i, RAND(3) FROM t;
+------+------------------+
| i    | RAND(3)          |
+------+------------------+
|    1 | 0.90576975597606 |
|    2 | 0.37307905813035 |
|    3 | 0.14808605345719 |
+------+------------------+
3 rows in set (0.00 sec)
mysql> SELECT i, RAND() FROM t;
+------+------------------+
| i    | RAND()           |
+------+------------------+
|    1 | 0.35877890638893 |
|    2 | 0.28941420772058 |
|    3 | 0.37073435016976 |
+------+------------------+
3 rows in set (0.00 sec)
mysql> SELECT i, RAND(3) FROM t;
+------+------------------+
| i    | RAND(3)          |
+------+------------------+
|    1 | 0.90576975597606 |
|    2 | 0.37307905813035 |
|    3 | 0.14808605345719 |
+------+------------------+
3 rows in set (0.01 sec)
RAND() in a WHERE clause is evaluated for every row (when selecting from one table) or combination of rows (when selecting from a multiple-table join). Thus, for optimizer purposes, RAND() is not a constant value and cannot be used for index optimizations. For more information, see Section 8.2.1.20, “Function Call Optimization”.
Use of a column with RAND() values in an ORDER BY or GROUP BY clause may yield unexpected results because for either clause a RAND() expression can be evaluated multiple times for the same row, each time returning a different result. If the goal is to retrieve rows in random order, you can use a statement like this:
SELECT * FROM tbl_name ORDER BY RAND();
To select a random sample from a set of rows, combine ORDER BY RAND() with LIMIT:
SELECT * FROM table1, table2 WHERE a=b AND c<d ORDER BY RAND() LIMIT 1000;
RAND() is not meant to be a perfect random generator. It is a fast way to generate random numbers on demand that is portable between platforms for the same MySQL version.
This function is unsafe for statement-based replication. A warning is logged if you use this function when binlog_format is set to STATEMENT.
ROUND(X), ROUND(X,D)
Rounds the argument X to D decimal places. The rounding algorithm depends on the data type of X. D defaults to 0 if not specified. D can be negative to cause D digits left of the decimal point of the value X to become zero. The maximum absolute value for D is 30; any digits in excess of 30 (or -30) are truncated.
mysql> SELECT ROUND(-1.23);
        -> -1
mysql> SELECT ROUND(-1.58);
        -> -2
mysql> SELECT ROUND(1.58);
        -> 2
mysql> SELECT ROUND(1.298, 1);
        -> 1.3
mysql> SELECT ROUND(1.298, 0);
        -> 1
mysql> SELECT ROUND(23.298, -1);
        -> 20
mysql> SELECT ROUND(.12345678901234567890123456789012345, 35);
        -> 0.123456789012345678901234567890
The return value has the same type as the first argument (assuming that it is integer, double, or decimal). This means that for an integer argument, the result is an integer (no decimal places):
mysql> SELECT ROUND(150.000,2), ROUND(150,2);
+------------------+--------------+
| ROUND(150.000,2) | ROUND(150,2) |
+------------------+--------------+
|           150.00 |          150 |
+------------------+--------------+
ROUND() uses the following rules depending on the type of the first argument:
- For exact-value numbers, ROUND() uses the “round half away from zero” or “round toward nearest” rule: A value with a fractional part of .5 or greater is rounded up to the next integer if positive or down to the next integer if negative. (In other words, it is rounded away from zero.) A value with a fractional part less than .5 is rounded down to the next integer if positive or up to the next integer if negative.
- For approximate-value numbers, the result depends on the C library. On many systems, this means that ROUND() uses the “round to nearest even” rule: A value with a fractional part exactly halfway between two integers is rounded to the nearest even integer.
The following example shows how rounding differs for exact and approximate values:
mysql> SELECT ROUND(2.5), ROUND(25E-1);
+------------+--------------+
| ROUND(2.5) | ROUND(25E-1) |
+------------+--------------+
| 3          |            2 |
+------------+--------------+
For more information, see Section 12.25, “Precision Math”.
In MySQL 8.0.21 and later, the data type returned by ROUND() (and TRUNCATE()) is determined according to the rules listed here:
- When the first argument is of any integer type, the return type is always BIGINT.
- When the first argument is of any floating-point type or of any non-numeric type, the return type is always DOUBLE.
- When the first argument is a DECIMAL value, the return type is also DECIMAL.
- The type attributes for the return value are also copied from the first argument, except in the case of DECIMAL, when the second argument is a constant value.
  When the desired number of decimal places is less than the scale of the argument, the scale and the precision of the result are adjusted accordingly.
  In addition, for ROUND() (but not for the TRUNCATE() function), the precision is extended by one place to accommodate rounding that increases the number of significant digits. For example, ROUND(99.999, 2) returns 100.00; the first argument is DECIMAL(5, 3), and the return type is DECIMAL(5, 2).
  If the second argument is negative, the return type has scale 0 and a corresponding precision; ROUND(99.999, -1) returns 100, which is DECIMAL(3, 0).
SIGN(X)
Returns the sign of the argument as -1, 0, or 1, depending on whether X is negative, zero, or positive.
mysql> SELECT SIGN(-32);
        -> -1
mysql> SELECT SIGN(0);
        -> 0
mysql> SELECT SIGN(234);
        -> 1
SIN(X)
Returns the sine of X, where X is given in radians.
mysql> SELECT SIN(PI());
        -> 1.2246063538224e-16
mysql> SELECT ROUND(SIN(PI()));
        -> 0
SQRT(X)
Returns the square root of a nonnegative number X.
mysql> SELECT SQRT(4);
        -> 2
mysql> SELECT SQRT(20);
        -> 4.4721359549996
mysql> SELECT SQRT(-16);
        -> NULL
TAN(X)
Returns the tangent of X, where X is given in radians.
mysql> SELECT TAN(PI());
        -> -1.2246063538224e-16
mysql> SELECT TAN(PI()+1);
        -> 1.5574077246549
TRUNCATE(X,D)
Returns the number X, truncated to D decimal places. If D is 0, the result has no decimal point or fractional part. D can be negative to cause D digits left of the decimal point of the value X to become zero.
mysql> SELECT TRUNCATE(1.223,1);
        -> 1.2
mysql> SELECT TRUNCATE(1.999,1);
        -> 1.9
mysql> SELECT TRUNCATE(1.999,0);
        -> 1
mysql> SELECT TRUNCATE(-1.999,1);
        -> -1.9
mysql> SELECT TRUNCATE(122,-2);
        -> 100
mysql> SELECT TRUNCATE(10.28*100,0);
        -> 1028
All numbers are rounded toward zero.
In MySQL 8.0.21 and later, the data type returned by TRUNCATE() follows the same rules that determine the return type of the ROUND() function; for details, see the description for ROUND().
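The difference between truncating, rounding, and taking the floor is easiest to see on a negative value. The following comparison is a small sketch using exact-value literals chosen for illustration:

```sql
-- For a positive value, TRUNCATE() and FLOOR() agree; ROUND() goes to the nearest integer.
SELECT TRUNCATE(1.58, 0), ROUND(1.58), FLOOR(1.58);     -- 1, 2, 1
-- For a negative value, TRUNCATE() moves toward zero,
-- whereas FLOOR() moves toward negative infinity.
SELECT TRUNCATE(-1.58, 0), ROUND(-1.58), FLOOR(-1.58);  -- -1, -2, -2
```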
This section describes the functions that can be used to manipulate temporal values. See Section 11.2, “Date and Time Data Types”, for a description of the range of values each date and time type has and the valid formats in which values may be specified.
Table 12.11 Date and Time Functions
Name | Description |
---|---|
ADDDATE() | Add time values (intervals) to a date value |
ADDTIME() | Add time |
CONVERT_TZ() | Convert from one time zone to another |
CURDATE() | Return the current date |
CURRENT_DATE(), CURRENT_DATE | Synonyms for CURDATE() |
CURRENT_TIME(), CURRENT_TIME | Synonyms for CURTIME() |
CURRENT_TIMESTAMP(), CURRENT_TIMESTAMP | Synonyms for NOW() |
CURTIME() | Return the current time |
DATE() | Extract the date part of a date or datetime expression |
DATE_ADD() | Add time values (intervals) to a date value |
DATE_FORMAT() | Format date as specified |
DATE_SUB() | Subtract a time value (interval) from a date |
DATEDIFF() | Subtract two dates |
DAY() | Synonym for DAYOFMONTH() |
DAYNAME() | Return the name of the weekday |
DAYOFMONTH() | Return the day of the month (0-31) |
DAYOFWEEK() | Return the weekday index of the argument |
DAYOFYEAR() | Return the day of the year (1-366) |
EXTRACT() | Extract part of a date |
FROM_DAYS() | Convert a day number to a date |
FROM_UNIXTIME() | Format Unix timestamp as a date |
GET_FORMAT() | Return a date format string |
HOUR() | Extract the hour |
LAST_DAY | Return the last day of the month for the argument |
LOCALTIME(), LOCALTIME | Synonym for NOW() |
LOCALTIMESTAMP, LOCALTIMESTAMP() | Synonym for NOW() |
MAKEDATE() | Create a date from the year and day of year |
MAKETIME() | Create time from hour, minute, second |
MICROSECOND() | Return the microseconds from argument |
MINUTE() | Return the minute from the argument |
MONTH() | Return the month from the date passed |
MONTHNAME() | Return the name of the month |
NOW() | Return the current date and time |
PERIOD_ADD() | Add a period to a year-month |
PERIOD_DIFF() | Return the number of months between periods |
QUARTER() | Return the quarter from a date argument |
SEC_TO_TIME() | Converts seconds to 'hh:mm:ss' format |
SECOND() | Return the second (0-59) |
STR_TO_DATE() | Convert a string to a date |
SUBDATE() | Synonym for DATE_SUB() when invoked with three arguments |
SUBTIME() | Subtract times |
SYSDATE() | Return the time at which the function executes |
TIME() | Extract the time portion of the expression passed |
TIME_FORMAT() | Format as time |
TIME_TO_SEC() | Return the argument converted to seconds |
TIMEDIFF() | Subtract time |
TIMESTAMP() | With a single argument, this function returns the date or datetime expression; with two arguments, the sum of the arguments |
TIMESTAMPADD() | Add an interval to a datetime expression |
TIMESTAMPDIFF() | Subtract an interval from a datetime expression |
TO_DAYS() | Return the date argument converted to days |
TO_SECONDS() | Return the date or datetime argument converted to seconds since Year 0 |
UNIX_TIMESTAMP() | Return a Unix timestamp |
UTC_DATE() | Return the current UTC date |
UTC_TIME() | Return the current UTC time |
UTC_TIMESTAMP() | Return the current UTC date and time |
WEEK() | Return the week number |
WEEKDAY() | Return the weekday index |
WEEKOFYEAR() | Return the calendar week of the date (1-53) |
YEAR() | Return the year |
YEARWEEK() | Return the year and week |
Here is an example that uses date functions. The following query selects all rows with a date_col value from within the last 30 days:
mysql> SELECT something FROM tbl_name
    -> WHERE DATE_SUB(CURDATE(),INTERVAL 30 DAY) <= date_col;
The query also selects rows with dates that lie in the future.
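If future dates should be excluded, an upper bound can be added. The following is one possible sketch, using the same placeholder names (something, tbl_name, date_col) as the query above; the half-open upper bound is a design choice so that the condition behaves the same for DATE and DATETIME columns:

```sql
-- Rows from the last 30 days, up to and including today.
SELECT something FROM tbl_name
WHERE date_col >= DATE_SUB(CURDATE(), INTERVAL 30 DAY)
  AND date_col <  DATE_ADD(CURDATE(), INTERVAL 1 DAY);
```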
Functions that expect date values usually accept datetime values and ignore the time part. Functions that expect time values usually accept datetime values and ignore the date part.
Functions that return the current date or time each are evaluated only once per query at the start of query execution. This means that multiple references to a function such as NOW() within a single query always produce the same result. (For our purposes, a single query also includes a call to a stored program (stored routine, trigger, or event) and all subprograms called by that program.) This principle also applies to CURDATE(), CURTIME(), UTC_DATE(), UTC_TIME(), UTC_TIMESTAMP(), and to any of their synonyms.
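The once-per-query behavior can be checked by placing a delay between two calls. This sketch contrasts NOW() with SYSDATE(), which the function table above describes as returning the time at which it executes; it assumes the server was not started with an option that makes SYSDATE() behave like NOW():

```sql
-- Both NOW() values are identical, even though SLEEP(2) pauses
-- execution for two seconds between them.
SELECT NOW(), SLEEP(2), NOW();
-- SYSDATE() is evaluated when it executes, so the two values
-- typically differ by about two seconds.
SELECT SYSDATE(), SLEEP(2), SYSDATE();
```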
The CURRENT_TIMESTAMP(), CURRENT_TIME(), CURRENT_DATE(), and FROM_UNIXTIME() functions return values in the current session time zone, which is available as the session value of the time_zone system variable. In addition, UNIX_TIMESTAMP() assumes that its argument is a datetime value in the session time zone. See Section 5.1.15, “MySQL Server Time Zone Support”.
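To see how the session time zone affects these functions, the same Unix timestamp and the same datetime literal can be evaluated under two settings. The sketch below uses numeric offsets, which do not require the time zone tables to be loaded; the specific values are chosen only for illustration:

```sql
SET time_zone = '+00:00';
SELECT FROM_UNIXTIME(86400);                   -- '1970-01-02 00:00:00'
SELECT UNIX_TIMESTAMP('1970-01-02 00:00:00');  -- 86400
SET time_zone = '+01:00';
-- The same instant is displayed one hour later in the new zone.
SELECT FROM_UNIXTIME(86400);                   -- '1970-01-02 01:00:00'
-- The same literal now denotes an instant one hour earlier.
SELECT UNIX_TIMESTAMP('1970-01-02 00:00:00');  -- 82800
```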
Some date functions can be used with “zero” dates or incomplete dates such as '2001-11-00', whereas others cannot. Functions that extract parts of dates typically work with incomplete dates and thus can return 0 when you might otherwise expect a nonzero value. For example:
mysql> SELECT DAYOFMONTH('2001-11-00'), MONTH('2005-00-00');
-> 0, 0
Other functions expect complete dates and return NULL for incomplete dates. These include functions that perform date arithmetic or that map parts of dates to names. For example:
mysql> SELECT DATE_ADD('2006-05-00',INTERVAL 1 DAY);
        -> NULL
mysql> SELECT DAYNAME('2006-05-00');
        -> NULL
Several functions are strict when passed a DATE() function value as their argument and reject incomplete dates with a day part of zero: CONVERT_TZ(), DATE_ADD(), DATE_SUB(), DAYOFYEAR(), TIMESTAMPDIFF(), TO_DAYS(), TO_SECONDS(), WEEK(), WEEKDAY(), WEEKOFYEAR(), YEARWEEK().
Fractional seconds for TIME, DATETIME, and TIMESTAMP values are supported, with up to microsecond precision. Functions that take temporal arguments accept values with fractional seconds. Return values from temporal functions include fractional seconds as appropriate.
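A short sketch of fractional-seconds handling follows. The literals are illustrative; the optional precision argument (written fsp in the function signatures in this section) requests a number of fractional digits from 0 to 6:

```sql
-- Extract the fractional part of a time value as microseconds.
SELECT MICROSECOND('12:00:00.123456');                 -- 123456
-- Temporal arithmetic carries fractional seconds through to the result.
SELECT ADDTIME('01:00:00.999999', '02:00:00.999998');  -- '03:00:01.999997'
-- Request microsecond and millisecond precision for the current time;
-- the returned values depend on when the statement runs.
SELECT NOW(6), CURTIME(3);
```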

ADDDATE(date,INTERVAL expr unit), ADDDATE(expr,days)
When invoked with the INTERVAL form of the second argument, ADDDATE() is a synonym for DATE_ADD(). The related function SUBDATE() is a synonym for DATE_SUB(). For information on the INTERVAL unit argument, see Temporal Intervals.
mysql> SELECT DATE_ADD('2008-01-02', INTERVAL 31 DAY);
-> '2008-02-02'
mysql> SELECT ADDDATE('2008-01-02', INTERVAL 31 DAY);
-> '2008-02-02'
When invoked with the days form of the second argument, MySQL treats it as an integer number of days to be added to expr.
mysql> SELECT ADDDATE('2008-01-02', 31);
-> '2008-02-02'

ADDTIME(expr1,expr2)
ADDTIME() adds expr2 to expr1 and returns the result. expr1 is a time or datetime expression, and expr2 is a time expression.
mysql> SELECT ADDTIME('2007-12-31 23:59:59.999999', '1 1:1:1.000002');
-> '2008-01-02 01:01:01.000001'
mysql> SELECT ADDTIME('01:00:00.999999', '02:00:00.999998');
-> '03:00:01.999997'

CONVERT_TZ(dt,from_tz,to_tz)
CONVERT_TZ()
converts a datetime value dt from the time zone given by from_tz to the time zone given by to_tz and returns the resulting value. Time zones are specified as described in Section 5.1.15, “MySQL Server Time Zone Support”. This function returns NULL if the arguments are invalid.
If the value falls out of the supported range of the TIMESTAMP type when converted from from_tz to UTC, no conversion occurs. The TIMESTAMP range is described in Section 11.2.1, “Date and Time Data Type Syntax”.
mysql> SELECT CONVERT_TZ('2004-01-01 12:00:00','GMT','MET');
-> '2004-01-01 13:00:00'
mysql> SELECT CONVERT_TZ('2004-01-01 12:00:00','+00:00','+10:00');
-> '2004-01-01 22:00:00'
Note: To use named time zones such as 'MET' or 'Europe/Amsterdam', the time zone tables must be properly set up. For instructions, see Section 5.1.15, “MySQL Server Time Zone Support”.

CURDATE()
Returns the current date as a value in
'YYYY-MM-DD' or YYYYMMDD format, depending on whether the function is used in string or numeric context.
mysql> SELECT CURDATE();
-> '2008-06-13'
mysql> SELECT CURDATE() + 0;
-> 20080613

CURRENT_DATE, CURRENT_DATE()
CURRENT_DATE and CURRENT_DATE() are synonyms for CURDATE().

CURRENT_TIME, CURRENT_TIME([fsp])
CURRENT_TIME and CURRENT_TIME() are synonyms for CURTIME().

CURRENT_TIMESTAMP, CURRENT_TIMESTAMP([fsp])
CURRENT_TIMESTAMP and CURRENT_TIMESTAMP() are synonyms for NOW().

CURTIME([fsp])
Returns the current time as a value in 'hh:mm:ss' or hhmmss format, depending on whether the function is used in string or numeric context. The value is expressed in the session time zone.
If the fsp argument is given to specify a fractional seconds precision from 0 to 6, the return value includes a fractional seconds part of that many digits.
mysql> SELECT CURTIME();
-> '23:50:26'
mysql> SELECT CURTIME() + 0;
-> 235026.000000

DATE(expr)
Extracts the date part of the date or datetime expression
expr.
mysql> SELECT DATE('2003-12-31 01:02:03');
-> '2003-12-31'

DATEDIFF(expr1,expr2)
DATEDIFF() returns expr1 − expr2 expressed as a value in days from one date to the other. expr1 and expr2 are date or date-and-time expressions. Only the date parts of the values are used in the calculation.
mysql> SELECT DATEDIFF('2007-12-31 23:59:59','2007-12-30');
-> 1
mysql> SELECT DATEDIFF('2010-11-30 23:59:59','2010-12-31');
-> -31

DATE_ADD(date,INTERVAL expr unit), DATE_SUB(date,INTERVAL expr unit)
These functions perform date arithmetic. The date argument specifies the starting date or datetime value. expr is an expression specifying the interval value to be added or subtracted from the starting date. expr is evaluated as a string; it may start with a - for negative intervals. unit is a keyword indicating the units in which the expression should be interpreted.
For more information about temporal interval syntax, including a full list of unit specifiers, the expected form of the expr argument for each unit value, and rules for operand interpretation in temporal arithmetic, see Temporal Intervals.
The return value depends on the arguments:
- DATE if the date argument is a DATE value and your calculations involve only YEAR, MONTH, and DAY parts (that is, no time parts).
- DATETIME if the first argument is a DATETIME (or TIMESTAMP) value, or if the first argument is a DATE and the unit value uses HOURS, MINUTES, or SECONDS.
- String otherwise.
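The three rules above can be written as a small decision function. The following Python sketch is illustrative only: the function name date_add_result_type is hypothetical, the type names are passed as plain strings, and only the simple units named in the rules (YEAR, MONTH, DAY, HOUR, MINUTE, SECOND) are covered. Any other unit with a DATE argument is rejected rather than guessed, because the rules as stated do not address it.

```python
# Units named in the rules above; other units are outside this sketch.
DATE_ONLY_UNITS = {"YEAR", "MONTH", "DAY"}
TIME_UNITS = {"HOUR", "MINUTE", "SECOND"}


def date_add_result_type(first_arg_type, unit):
    """Return 'DATE', 'DATETIME', or 'STRING' following the three stated rules."""
    arg = first_arg_type.upper()
    unit = unit.upper()
    if arg in ("DATETIME", "TIMESTAMP"):
        return "DATETIME"
    if arg == "DATE":
        if unit in DATE_ONLY_UNITS:
            return "DATE"
        if unit in TIME_UNITS:
            return "DATETIME"
        # Assumption: units not listed above are not modeled here.
        raise ValueError(f"unit {unit!r} is not covered by this sketch")
    return "STRING"


print(date_add_result_type("DATE", "DAY"))         # DATE
print(date_add_result_type("DATE", "HOUR"))        # DATETIME
print(date_add_result_type("TIMESTAMP", "MONTH"))  # DATETIME
print(date_add_result_type("VARCHAR", "DAY"))      # STRING
```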
To ensure that the result is DATETIME, you can use CAST() to convert the first argument to DATETIME.
mysql> SELECT DATE_ADD('2018-05-01',INTERVAL 1 DAY);
-> '2018-05-02'
mysql> SELECT DATE_SUB('2018-05-01',INTERVAL 1 YEAR);
-> '2017-05-01'
mysql> SELECT DATE_ADD('2020-12-31 23:59:59', INTERVAL 1 SECOND);
-> '2021-01-01 00:00:00'
mysql> SELECT DATE_ADD('2018-12-31 23:59:59', INTERVAL 1 DAY);
-> '2019-01-01 23:59:59'
mysql> SELECT DATE_ADD('2100-12-31 23:59:59', INTERVAL '1:1' MINUTE_SECOND);
-> '2101-01-01 00:01:00'
mysql> SELECT DATE_SUB('2025-01-01 00:00:00', INTERVAL '1 1:1:1' DAY_SECOND);
-> '2024-12-30 22:58:59'
mysql> SELECT DATE_ADD('1900-01-01 00:00:00', INTERVAL '-1 10' DAY_HOUR);
-> '1899-12-30 14:00:00'
mysql> SELECT DATE_SUB('1998-01-02', INTERVAL 31 DAY);
-> '1997-12-02'
mysql> SELECT DATE_ADD('1992-12-31 23:59:59.000002', INTERVAL '1.999999' SECOND_MICROSECOND);
-> '1993-01-01 00:00:01.000001'

DATE_FORMAT(date,format)
Formats the
date value according to the format string.
The specifiers shown in the following table may be used in the format string. The % character is required before format specifier characters. The specifiers apply to other functions as well: STR_TO_DATE(), TIME_FORMAT(), UNIX_TIMESTAMP().
Specifier  Description
%a         Abbreviated weekday name (Sun..Sat)
%b         Abbreviated month name (Jan..Dec)
%c         Month, numeric (0..12)
%D         Day of the month with English suffix (0th, 1st, 2nd, 3rd, …)
%d         Day of the month, numeric (00..31)
%e         Day of the month, numeric (0..31)
%f         Microseconds (000000..999999)
%H         Hour (00..23)
%h         Hour (01..12)
%I         Hour (01..12)
%i         Minutes, numeric (00..59)
%j         Day of year (001..366)
%k         Hour (0..23)
%l         Hour (1..12)
%M         Month name (January..December)
%m         Month, numeric (00..12)
%p         AM or PM
%r         Time, 12-hour (hh:mm:ss followed by AM or PM)
%S         Seconds (00..59)
%s         Seconds (00..59)
%T         Time, 24-hour (hh:mm:ss)
%U         Week (00..53), where Sunday is the first day of the week; WEEK() mode 0
%u         Week (00..53), where Monday is the first day of the week; WEEK() mode 1
%V         Week (01..53), where Sunday is the first day of the week; WEEK() mode 2; used with %X
%v         Week (01..53), where Monday is the first day of the week; WEEK() mode 3; used with %x
%W         Weekday name (Sunday..Saturday)
%w         Day of the week (0=Sunday..6=Saturday)
%X         Year for the week where Sunday is the first day of the week, numeric, four digits; used with %V
%x         Year for the week, where Monday is the first day of the week, numeric, four digits; used with %v
%Y         Year, numeric, four digits
%y         Year, numeric (two digits)
%%         A literal % character
%x         x, for any “x” not listed above
Ranges for the month and day specifiers begin with zero because MySQL permits the storing of incomplete dates such as '2014-00-00'.
The language used for day and month names and abbreviations is controlled by the value of the lc_time_names system variable (Section 10.16, “MySQL Server Locale Support”).
For the %U, %u, %V, and %v specifiers, see the description of the WEEK() function for information about the mode values. The mode affects how week numbering occurs.
DATE_FORMAT() returns a string with a character set and collation given by character_set_connection and collation_connection so that it can return month and weekday names containing non-ASCII characters.
mysql> SELECT DATE_FORMAT('2009-10-04 22:23:00', '%W %M %Y');
-> 'Sunday October 2009'
mysql> SELECT DATE_FORMAT('2007-10-04 22:23:00', '%H:%i:%s');
-> '22:23:00'
mysql> SELECT DATE_FORMAT('1900-10-04 22:23:00', '%D %y %a %d %m %b %j');
-> '4th 00 Thu 04 10 Oct 277'
mysql> SELECT DATE_FORMAT('1997-10-04 22:23:00', '%H %k %I %r %T %S %w');
-> '22 22 10 10:23:00 PM 22:23:00 00 6'
mysql> SELECT DATE_FORMAT('1999-01-01', '%X %V');
-> '1998 52'
mysql> SELECT DATE_FORMAT('2006-06-00', '%d');
-> '00'

DATE_SUB(date,INTERVAL expr unit)
See the description for DATE_ADD().

DAY(date)
DAY() is a synonym for DAYOFMONTH().

DAYNAME(date)
Returns the name of the weekday for
date. The language used for the name is controlled by the value of the lc_time_names system variable (Section 10.16, “MySQL Server Locale Support”).
mysql> SELECT DAYNAME('2007-02-03');
-> 'Saturday'

DAYOFMONTH(date)
Returns the day of the month for date, in the range 1 to 31, or 0 for dates such as '0000-00-00' or '2008-00-00' that have a zero day part.
mysql> SELECT DAYOFMONTH('2007-02-03');
-> 3

DAYOFWEEK(date)
Returns the weekday index for date (1 = Sunday, 2 = Monday, …, 7 = Saturday). These index values correspond to the ODBC standard.
mysql> SELECT DAYOFWEEK('2007-02-03');
-> 7

DAYOFYEAR(date)
Returns the day of the year for date, in the range 1 to 366.
mysql> SELECT DAYOFYEAR('2007-02-03');
-> 34

EXTRACT(unit FROM date)
The EXTRACT() function uses the same kinds of unit specifiers as DATE_ADD() or DATE_SUB(), but extracts parts from the date rather than performing date arithmetic. For information on the unit argument, see Temporal Intervals.
mysql> SELECT EXTRACT(YEAR FROM '2019-07-02');
-> 2019
mysql> SELECT EXTRACT(YEAR_MONTH FROM '2019-07-02 01:02:03');
-> 201907
mysql> SELECT EXTRACT(DAY_MINUTE FROM '2019-07-02 01:02:03');
-> 20102
mysql> SELECT EXTRACT(MICROSECOND FROM '2003-01-02 10:30:00.000123');
-> 123

FROM_DAYS(N)
Given a day number N, returns a DATE value.
mysql> SELECT FROM_DAYS(730669);
-> '2000-07-03'
Use
FROM_DAYS() with caution on old dates. It is not intended for use with values that precede the advent of the Gregorian calendar (1582). See Section 12.9, “What Calendar Is Used By MySQL?”.

FROM_UNIXTIME(unix_timestamp[,format])
Returns a representation of the unix_timestamp argument as a value in 'YYYY-MM-DD hh:mm:ss' or YYYYMMDDhhmmss format, depending on whether the function is used in a string or numeric context. unix_timestamp is an internal timestamp value representing seconds since '1970-01-01 00:00:00' UTC, such as produced by the UNIX_TIMESTAMP() function.
The return value is expressed in the session time zone. (Clients can set the session time zone as described in Section 5.1.15, “MySQL Server Time Zone Support”.) The format string, if given, is used to format the result the same way as described in the entry for the DATE_FORMAT() function.
mysql> SELECT FROM_UNIXTIME(1447430881);
-> '2015-11-13 10:08:01'
mysql> SELECT FROM_UNIXTIME(1447430881) + 0;
-> 20151113100801
mysql> SELECT FROM_UNIXTIME(1447430881, '%Y %D %M %h:%i:%s %x');
-> '2015 13th November 10:08:01 2015'
Note: If you use UNIX_TIMESTAMP() and FROM_UNIXTIME() to convert between values in a non-UTC time zone and Unix timestamp values, the conversion is lossy because the mapping is not one-to-one in both directions. For details, see the description of the UNIX_TIMESTAMP() function.

GET_FORMAT({DATE|TIME|DATETIME}, {'EUR'|'USA'|'JIS'|'ISO'|'INTERNAL'})
Returns a format string. This function is useful in combination with the DATE_FORMAT() and the STR_TO_DATE() functions.
The possible values for the first and second arguments result in several possible format strings (for the specifiers used, see the table in the DATE_FORMAT() function description). ISO format refers to ISO 9075, not ISO 8601.
Function Call                      Result
GET_FORMAT(DATE,'USA')             '%m.%d.%Y'
GET_FORMAT(DATE,'JIS')             '%Y-%m-%d'
GET_FORMAT(DATE,'ISO')             '%Y-%m-%d'
GET_FORMAT(DATE,'EUR')             '%d.%m.%Y'
GET_FORMAT(DATE,'INTERNAL')        '%Y%m%d'
GET_FORMAT(DATETIME,'USA')         '%Y-%m-%d %H.%i.%s'
GET_FORMAT(DATETIME,'JIS')         '%Y-%m-%d %H:%i:%s'
GET_FORMAT(DATETIME,'ISO')         '%Y-%m-%d %H:%i:%s'
GET_FORMAT(DATETIME,'EUR')         '%Y-%m-%d %H.%i.%s'
GET_FORMAT(DATETIME,'INTERNAL')    '%Y%m%d%H%i%s'
GET_FORMAT(TIME,'USA')             '%h:%i:%s %p'
GET_FORMAT(TIME,'JIS')             '%H:%i:%s'
GET_FORMAT(TIME,'ISO')             '%H:%i:%s'
GET_FORMAT(TIME,'EUR')             '%H.%i.%s'
GET_FORMAT(TIME,'INTERNAL')        '%H%i%s'
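To see what these format strings produce, the table can be treated as a lookup and expanded with a tiny renderer. The Python sketch below is an illustration, not the server's formatter: the names GET_FORMAT, SPECIFIERS, and render are hypothetical, only a few rows of the table are copied in, and only the eight specifiers that appear in the table are handled.

```python
from datetime import datetime

# A few rows copied from the table above, keyed by (type, style).
GET_FORMAT = {
    ("DATE", "USA"): "%m.%d.%Y",
    ("DATE", "EUR"): "%d.%m.%Y",
    ("DATE", "ISO"): "%Y-%m-%d",
    ("DATETIME", "ISO"): "%Y-%m-%d %H:%i:%s",
    ("TIME", "USA"): "%h:%i:%s %p",
    ("TIME", "EUR"): "%H.%i.%s",
}

# Only the specifiers used by the table; meanings follow the DATE_FORMAT() table.
SPECIFIERS = {
    "Y": lambda d: f"{d.year:04d}",
    "m": lambda d: f"{d.month:02d}",
    "d": lambda d: f"{d.day:02d}",
    "H": lambda d: f"{d.hour:02d}",
    "h": lambda d: f"{(d.hour % 12 or 12):02d}",
    "i": lambda d: f"{d.minute:02d}",
    "s": lambda d: f"{d.second:02d}",
    "p": lambda d: "AM" if d.hour < 12 else "PM",
}


def render(moment, fmt):
    """Expand %-specifiers in fmt for the given datetime; copy other characters as-is."""
    out = []
    chars = iter(fmt)
    for ch in chars:
        if ch == "%":
            out.append(SPECIFIERS[next(chars)](moment))
        else:
            out.append(ch)
    return "".join(out)


print(render(datetime(2003, 10, 3), GET_FORMAT[("DATE", "EUR")]))             # 03.10.2003
print(render(datetime(2003, 10, 3, 22, 23, 0), GET_FORMAT[("TIME", "USA")]))  # 10:23:00 PM
```

Note that the minute specifier is %i rather than the %M used by C-style strftime, so these strings cannot be passed to strftime unchanged.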
TIMESTAMP can also be used as the first argument to GET_FORMAT(), in which case the function returns the same values as for DATETIME.
mysql> SELECT DATE_FORMAT('2003-10-03',GET_FORMAT(DATE,'EUR'));
-> '03.10.2003'
mysql> SELECT STR_TO_DATE('10.31.2003',GET_FORMAT(DATE,'USA'));
-> '2003-10-31'

HOUR(time)
Returns the hour for time. The range of the return value is 0 to 23 for time-of-day values. However, the range of TIME values actually is much larger, so HOUR can return values greater than 23.
mysql> SELECT HOUR('10:05:03');
-> 10
mysql> SELECT HOUR('272:59:59');
-> 272

LAST_DAY(date)
Takes a date or datetime value and returns the corresponding value for the last day of the month. Returns NULL if the argument is invalid.
mysql> SELECT LAST_DAY('2003-02-05');
-> '2003-02-28'
mysql> SELECT LAST_DAY('2004-02-05');
-> '2004-02-29'
mysql> SELECT LAST_DAY('2004-01-01 01:01:01');
-> '2004-01-31'
mysql> SELECT LAST_DAY('2003-03-32');
-> NULL

LOCALTIME, LOCALTIME([fsp])
LOCALTIME and LOCALTIME() are synonyms for NOW().

LOCALTIMESTAMP, LOCALTIMESTAMP([fsp])
LOCALTIMESTAMP and LOCALTIMESTAMP() are synonyms for NOW().

MAKEDATE(year,dayofyear)
Returns a date, given year and day-of-year values. dayofyear must be greater than 0 or the result is NULL.
mysql> SELECT MAKEDATE(2011,31), MAKEDATE(2011,32);
-> '2011-01-31', '2011-02-01'
mysql> SELECT MAKEDATE(2011,365), MAKEDATE(2014,365);
-> '2011-12-31', '2014-12-31'
mysql> SELECT MAKEDATE(2011,0);
-> NULL

MAKETIME(hour,minute,second)
Returns a time value calculated from the
hour, minute, and second arguments.
The second argument can have a fractional part.
mysql> SELECT MAKETIME(12,15,30);
-> '12:15:30'

MICROSECOND(expr)
Returns the microseconds from the time or datetime expression expr as a number in the range from 0 to 999999.
mysql> SELECT MICROSECOND('12:00:00.123456');
-> 123456
mysql> SELECT MICROSECOND('2019-12-31 23:59:59.000010');
-> 10

MINUTE(time)
Returns the minute for time, in the range 0 to 59.
mysql> SELECT MINUTE('2008-02-03 10:05:03');
-> 5

MONTH(date)
Returns the month for date, in the range 1 to 12 for January to December, or 0 for dates such as '0000-00-00' or '2008-00-00' that have a zero month part.
mysql> SELECT MONTH('2008-02-03');
-> 2

MONTHNAME(date)
Returns the full name of the month for date. The language used for the name is controlled by the value of the lc_time_names system variable (Section 10.16, “MySQL Server Locale Support”).
mysql> SELECT MONTHNAME('2008-02-03');
-> 'February'

NOW([fsp])
Returns the current date and time as a value in 'YYYY-MM-DD hh:mm:ss' or YYYYMMDDhhmmss format, depending on whether the function is used in string or numeric context. The value is expressed in the session time zone.
If the fsp argument is given to specify a fractional seconds precision from 0 to 6, the return value includes a fractional seconds part of that many digits.
mysql> SELECT NOW();
-> '2007-12-15 23:50:26'
mysql> SELECT NOW() + 0;
-> 20071215235026.000000
NOW() returns a constant time that indicates the time at which the statement began to execute. (Within a stored function or trigger, NOW() returns the time at which the function or triggering statement began to execute.) This differs from the behavior for SYSDATE(), which returns the exact time at which it executes.
mysql> SELECT NOW(), SLEEP(2), NOW();
+---------------------+----------+---------------------+
| NOW()               | SLEEP(2) | NOW()               |
+---------------------+----------+---------------------+
| 2006-04-12 13:47:36 |        0 | 2006-04-12 13:47:36 |
+---------------------+----------+---------------------+
mysql> SELECT SYSDATE(), SLEEP(2), SYSDATE();
+---------------------+----------+---------------------+
| SYSDATE()           | SLEEP(2) | SYSDATE()           |
+---------------------+----------+---------------------+
| 2006-04-12 13:47:44 |        0 | 2006-04-12 13:47:46 |
+---------------------+----------+---------------------+
In addition, the SET TIMESTAMP statement affects the value returned by NOW() but not by SYSDATE(). This means that timestamp settings in the binary log have no effect on invocations of SYSDATE(). Setting the timestamp to a nonzero value causes each subsequent invocation of NOW() to return that value. Setting the timestamp to zero cancels this effect so that NOW() once again returns the current date and time.
See the description for SYSDATE() for additional information about the differences between the two functions.

PERIOD_ADD(P,N)
Adds
N months to period P (in the format YYMM or YYYYMM). Returns a value in the format YYYYMM.
Note: The period argument P is not a date value.
mysql> SELECT PERIOD_ADD(200801,2);
-> 200803

PERIOD_DIFF(P1,P2)
Returns the number of months between periods P1 and P2. P1 and P2 should be in the format YYMM or YYYYMM. Note that the period arguments P1 and P2 are not date values.
mysql> SELECT PERIOD_DIFF(200802,200703);
-> 11

QUARTER(date)
Returns the quarter of the year for date, in the range 1 to 4.
mysql> SELECT QUARTER('2008-04-01');
-> 2

SECOND(time)
Returns the second for time, in the range 0 to 59.
mysql> SELECT SECOND('10:05:03');
-> 3

SEC_TO_TIME(seconds)
Returns the seconds argument, converted to hours, minutes, and seconds, as a TIME value. The range of the result is constrained to that of the TIME data type. A warning occurs if the argument corresponds to a value outside that range.
mysql> SELECT SEC_TO_TIME(2378);
-> '00:39:38'
mysql> SELECT SEC_TO_TIME(2378) + 0;
-> 3938

STR_TO_DATE(str,format)
This is the inverse of the
DATE_FORMAT() function. It takes a string str and a format string format. STR_TO_DATE() returns a DATETIME value if the format string contains both date and time parts, or a DATE or TIME value if the string contains only date or time parts. If the date, time, or datetime value extracted from str is illegal, STR_TO_DATE() returns NULL and produces a warning.
The server scans str attempting to match format to it. The format string can contain literal characters and format specifiers beginning with %. Literal characters in format must match literally in str. Format specifiers in format must match a date or time part in str. For the specifiers that can be used in format, see the DATE_FORMAT() function description.
mysql> SELECT STR_TO_DATE('01,5,2013','%d,%m,%Y');
-> '2013-05-01'
mysql> SELECT STR_TO_DATE('May 1, 2013','%M %d,%Y');
-> '2013-05-01'
Scanning starts at the beginning of str and fails if format is found not to match. Extra characters at the end of str are ignored.
mysql> SELECT STR_TO_DATE('a09:30:17','a%h:%i:%s');
-> '09:30:17'
mysql> SELECT STR_TO_DATE('a09:30:17','%h:%i:%s');
-> NULL
mysql> SELECT STR_TO_DATE('09:30:17a','%h:%i:%s');
-> '09:30:17'
Unspecified date or time parts have a value of 0, so incompletely specified values in str produce a result with some or all parts set to 0:
mysql> SELECT STR_TO_DATE('abc','abc');
-> '0000-00-00'
mysql> SELECT STR_TO_DATE('9','%m');
-> '0000-09-00'
mysql> SELECT STR_TO_DATE('9','%s');
-> '00:00:09'
Range checking on the parts of date values is as described in Section 11.2.2, “The DATE, DATETIME, and TIMESTAMP Types”. This means, for example, that “zero” dates or dates with part values of 0 are permitted unless the SQL mode is set to disallow such values.
mysql> SELECT STR_TO_DATE('00/00/0000', '%m/%d/%Y');
-> '0000-00-00'
mysql> SELECT STR_TO_DATE('04/31/2004', '%m/%d/%Y');
-> '2004-04-31'
If the NO_ZERO_DATE SQL mode is enabled, zero dates are disallowed. In that case, STR_TO_DATE() returns NULL and generates a warning:
mysql> SET sql_mode = '';
mysql> SELECT STR_TO_DATE('00/00/0000', '%m/%d/%Y');
+---------------------------------------+
| STR_TO_DATE('00/00/0000', '%m/%d/%Y') |
+---------------------------------------+
| 0000-00-00                            |
+---------------------------------------+
mysql> SET sql_mode = 'NO_ZERO_DATE';
mysql> SELECT STR_TO_DATE('00/00/0000', '%m/%d/%Y');
+---------------------------------------+
| STR_TO_DATE('00/00/0000', '%m/%d/%Y') |
+---------------------------------------+
| NULL                                  |
+---------------------------------------+
mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Warning
   Code: 1411
Message: Incorrect datetime value: '00/00/0000' for function str_to_date
Note: You cannot use format "%X%V" to convert a year-week string to a date because the combination of a year and week does not uniquely identify a year and month if the week crosses a month boundary. To convert a year-week to a date, you should also specify the weekday:
mysql> SELECT STR_TO_DATE('200442 Monday', '%X%V %W');
-> '2004-10-18'

SUBDATE(date,INTERVAL expr unit), SUBDATE(expr,days)
When invoked with the
INTERVAL form of the second argument, SUBDATE() is a synonym for DATE_SUB(). For information on the INTERVAL unit argument, see the discussion for DATE_ADD().
mysql> SELECT DATE_SUB('2008-01-02', INTERVAL 31 DAY);
-> '2007-12-02'
mysql> SELECT SUBDATE('2008-01-02', INTERVAL 31 DAY);
-> '2007-12-02'
The second form enables the use of an integer value for days. In such cases, it is interpreted as the number of days to be subtracted from the date or datetime expression expr.
mysql> SELECT SUBDATE('2008-01-02 12:00:00', 31);
-> '2007-12-02 12:00:00'

SUBTIME(expr1,expr2)
SUBTIME() returns expr1 − expr2 expressed as a value in the same format as expr1. expr1 is a time or datetime expression, and expr2 is a time expression.
mysql> SELECT SUBTIME('2007-12-31 23:59:59.999999','1 1:1:1.000002');
-> '2007-12-30 22:58:58.999997'
mysql> SELECT SUBTIME('01:00:00.999999', '02:00:00.999998');
-> '-00:59:59.999999'

SYSDATE([fsp])
Returns the current date and time as a value in
'YYYY-MM-DD hh:mm:ss' or YYYYMMDDhhmmss format, depending on whether the function is used in string or numeric context.
If the fsp argument is given to specify a fractional seconds precision from 0 to 6, the return value includes a fractional seconds part of that many digits.
SYSDATE() returns the time at which it executes. This differs from the behavior for NOW(), which returns a constant time that indicates the time at which the statement began to execute. (Within a stored function or trigger, NOW() returns the time at which the function or triggering statement began to execute.)
mysql> SELECT NOW(), SLEEP(2), NOW();
+---------------------+----------+---------------------+
| NOW()               | SLEEP(2) | NOW()               |
+---------------------+----------+---------------------+
| 2006-04-12 13:47:36 |        0 | 2006-04-12 13:47:36 |
+---------------------+----------+---------------------+
mysql> SELECT SYSDATE(), SLEEP(2), SYSDATE();
+---------------------+----------+---------------------+
| SYSDATE()           | SLEEP(2) | SYSDATE()           |
+---------------------+----------+---------------------+
| 2006-04-12 13:47:44 |        0 | 2006-04-12 13:47:46 |
+---------------------+----------+---------------------+
In addition, the SET TIMESTAMP statement affects the value returned by NOW() but not by SYSDATE(). This means that timestamp settings in the binary log have no effect on invocations of SYSDATE().
Because SYSDATE() can return different values even within the same statement, and is not affected by SET TIMESTAMP, it is nondeterministic and therefore unsafe for replication if statement-based binary logging is used. If that is a problem, you can use row-based logging.
Alternatively, you can use the --sysdate-is-now option to cause SYSDATE() to be an alias for NOW(). This works if the option is used on both the replication source server and the replica.
The nondeterministic nature of SYSDATE() also means that indexes cannot be used for evaluating expressions that refer to it.

TIME(expr)
Extracts the time part of the time or datetime expression
expr and returns it as a string.
This function is unsafe for statement-based replication. A warning is logged if you use this function when binlog_format is set to STATEMENT.
mysql> SELECT TIME('2003-12-31 01:02:03');
-> '01:02:03'
mysql> SELECT TIME('2003-12-31 01:02:03.000123');
-> '01:02:03.000123'

TIMEDIFF(expr1,expr2)
TIMEDIFF() returns expr1 − expr2 expressed as a time value. expr1 and expr2 are time or date-and-time expressions, but both must be of the same type.
The result returned by TIMEDIFF() is limited to the range allowed for TIME values. Alternatively, you can use either of the functions TIMESTAMPDIFF() and UNIX_TIMESTAMP(), both of which return integers.
mysql> SELECT TIMEDIFF('2000:01:01 00:00:00', '2000:01:01 00:00:00.000001');
-> '-00:00:00.000001'
mysql> SELECT TIMEDIFF('2008-12-31 23:59:59.000001', '2008-12-30 01:01:01.000002');
-> '46:58:57.999999'

TIMESTAMP(expr), TIMESTAMP(expr1,expr2)
With a single argument, this function returns the date or datetime expression
expr as a datetime value. With two arguments, it adds the time expression expr2 to the date or datetime expression expr1 and returns the result as a datetime value.
mysql> SELECT TIMESTAMP('2003-12-31');
-> '2003-12-31 00:00:00'
mysql> SELECT TIMESTAMP('2003-12-31 12:00:00','12:00:00');
-> '2004-01-01 00:00:00'

TIMESTAMPADD(unit,interval,datetime_expr)
Adds the integer expression interval to the date or datetime expression datetime_expr. The unit for interval is given by the unit argument, which should be one of the following values: MICROSECOND (microseconds), SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, or YEAR.
The unit value may be specified using one of the keywords as shown, or with a prefix of SQL_TSI_. For example, DAY and SQL_TSI_DAY both are legal.
mysql> SELECT TIMESTAMPADD(MINUTE,1,'2003-01-02');
-> '2003-01-02 00:01:00'
mysql> SELECT TIMESTAMPADD(WEEK,1,'2003-01-02');
-> '2003-01-09'

TIMESTAMPDIFF(unit,datetime_expr1,datetime_expr2)
Returns datetime_expr2 − datetime_expr1, where datetime_expr1 and datetime_expr2 are date or datetime expressions. One expression may be a date and the other a datetime; a date value is treated as a datetime having the time part '00:00:00' where necessary. The unit for the result (an integer) is given by the unit argument. The legal values for unit are the same as those listed in the description of the TIMESTAMPADD() function.
mysql> SELECT TIMESTAMPDIFF(MONTH,'2003-02-01','2003-05-01');
-> 3
mysql> SELECT TIMESTAMPDIFF(YEAR,'2002-05-01','2001-01-01');
-> -1
mysql> SELECT TIMESTAMPDIFF(MINUTE,'2003-02-01','2003-05-01 12:05:55');
-> 128885
Note: The order of the date or datetime arguments for this function is the opposite of that used with the TIMESTAMP() function when invoked with 2 arguments.

TIME_FORMAT(time,format)
This is used like the
DATE_FORMAT() function, but the format string may contain format specifiers only for hours, minutes, seconds, and microseconds. Other specifiers produce a NULL value or 0.
If the time value contains an hour part that is greater than 23, the %H and %k hour format specifiers produce a value larger than the usual range of 0..23. The other hour format specifiers produce the hour value modulo 12.
mysql> SELECT TIME_FORMAT('100:00:00', '%H %k %h %I %l');
-> '100 100 04 04 4'

TIME_TO_SEC(time)
Returns the time argument, converted to seconds.
mysql> SELECT TIME_TO_SEC('22:23:00');
-> 80580
mysql> SELECT TIME_TO_SEC('00:39:38');
-> 2378

TO_DAYS(date)
Given a date date, returns a day number (the number of days since year 0).
mysql> SELECT TO_DAYS(950501);
-> 728779
mysql> SELECT TO_DAYS('2007-10-07');
-> 733321
TO_DAYS() is not intended for use with values that precede the advent of the Gregorian calendar (1582), because it does not take into account the days that were lost when the calendar was changed. For dates before 1582 (and possibly a later year in other locales), results from this function are not reliable. See Section 12.9, “What Calendar Is Used By MySQL?”, for details.
Remember that MySQL converts two-digit year values in dates to four-digit form using the rules in Section 11.2, “Date and Time Data Types”. For example, '2008-10-07' and '08-10-07' are seen as identical dates:
mysql> SELECT TO_DAYS('2008-10-07'), TO_DAYS('08-10-07');
-> 733687, 733687
In MySQL, the zero date is defined as '0000-00-00', even though this date is itself considered invalid. This means that, for '0000-00-00' and '0000-01-01', TO_DAYS() returns the values shown here:
mysql> SELECT TO_DAYS('0000-00-00');
+-----------------------+
| to_days('0000-00-00') |
+-----------------------+
| NULL                  |
+-----------------------+
1 row in set, 1 warning (0.00 sec)
mysql> SHOW WARNINGS;
+---------+------+----------------------------------------+
| Level   | Code | Message                                |
+---------+------+----------------------------------------+
| Warning | 1292 | Incorrect datetime value: '0000-00-00' |
+---------+------+----------------------------------------+
1 row in set (0.00 sec)
mysql> SELECT TO_DAYS('0000-01-01');
+-----------------------+
| to_days('0000-01-01') |
+-----------------------+
|                     1 |
+-----------------------+
1 row in set (0.00 sec)
This is true whether or not the ALLOW_INVALID_DATES SQL server mode is enabled.

TO_SECONDS(expr)
Given a date or datetime
expr, returns the number of seconds since the year 0. If expr is not a valid date or datetime value, returns NULL.
mysql> SELECT TO_SECONDS(950501);
-> 62966505600
mysql> SELECT TO_SECONDS('2009-11-29');
-> 63426672000
mysql> SELECT TO_SECONDS('2009-11-29 13:43:32');
-> 63426721412
mysql> SELECT TO_SECONDS( NOW() );
-> 63426721458
Like TO_DAYS(), TO_SECONDS() is not intended for use with values that precede the advent of the Gregorian calendar (1582), because it does not take into account the days that were lost when the calendar was changed. For dates before 1582 (and possibly a later year in other locales), results from this function are not reliable. See Section 12.9, “What Calendar Is Used By MySQL?”, for details.
Like TO_DAYS(), TO_SECONDS() converts two-digit year values in dates to four-digit form using the rules in Section 11.2, “Date and Time Data Types”.
In MySQL, the zero date is defined as '0000-00-00', even though this date is itself considered invalid. This means that, for '0000-00-00' and '0000-01-01', TO_SECONDS() returns the values shown here:
mysql> SELECT TO_SECONDS('0000-00-00');
+--------------------------+
| TO_SECONDS('0000-00-00') |
+--------------------------+
| NULL                     |
+--------------------------+
1 row in set, 1 warning (0.00 sec)
mysql> SHOW WARNINGS;
+---------+------+----------------------------------------+
| Level   | Code | Message                                |
+---------+------+----------------------------------------+
| Warning | 1292 | Incorrect datetime value: '0000-00-00' |
+---------+------+----------------------------------------+
1 row in set (0.00 sec)
mysql> SELECT TO_SECONDS('0000-01-01');
+--------------------------+
| TO_SECONDS('0000-01-01') |
+--------------------------+
|                    86400 |
+--------------------------+
1 row in set (0.00 sec)
This is true whether or not the ALLOW_INVALID_DATES SQL server mode is enabled.

UNIX_TIMESTAMP([date])
If
UNIX_TIMESTAMP() is called with no date argument, it returns a Unix timestamp representing seconds since '1970-01-01 00:00:00' UTC.
If UNIX_TIMESTAMP() is called with a date argument, it returns the value of the argument as seconds since '1970-01-01 00:00:00' UTC. The server interprets date as a value in the session time zone and converts it to an internal Unix timestamp value in UTC. (Clients can set the session time zone as described in Section 5.1.15, “MySQL Server Time Zone Support”.) The date argument may be a DATE, DATETIME, or TIMESTAMP string, or a number in YYMMDD, YYMMDDhhmmss, YYYYMMDD, or YYYYMMDDhhmmss format. If the argument includes a time part, it may optionally include a fractional seconds part.
The return value is an integer if no argument is given or the argument does not include a fractional seconds part, or DECIMAL if an argument is given that includes a fractional seconds part.
When the date argument is a TIMESTAMP column, UNIX_TIMESTAMP() returns the internal timestamp value directly, with no implicit “string-to-Unix-timestamp” conversion.
The valid range of argument values is the same as for the TIMESTAMP data type: '1970-01-01 00:00:01.000000' UTC to '2038-01-19 03:14:07.999999' UTC. If you pass an out-of-range date to UNIX_TIMESTAMP(), it returns 0.
mysql> SELECT UNIX_TIMESTAMP();
-> 1447431666
mysql> SELECT UNIX_TIMESTAMP('2015-11-13 10:20:19');
-> 1447431619
mysql> SELECT UNIX_TIMESTAMP('2015-11-13 10:20:19.012');
-> 1447431619.012
If you use UNIX_TIMESTAMP() and FROM_UNIXTIME() to convert between values in a non-UTC time zone and Unix timestamp values, the conversion is lossy because the mapping is not one-to-one in both directions. For example, due to conventions for local time zone changes such as Daylight Saving Time (DST), it is possible for UNIX_TIMESTAMP() to map two values that are distinct in a non-UTC time zone to the same Unix timestamp value. FROM_UNIXTIME() maps that value back to only one of the original values. Here is an example, using values that are distinct in the MET time zone:
mysql> SET time_zone = 'MET';
mysql> SELECT UNIX_TIMESTAMP('2005-03-27 03:00:00');
+---------------------------------------+
| UNIX_TIMESTAMP('2005-03-27 03:00:00') |
+---------------------------------------+
|                            1111885200 |
+---------------------------------------+
mysql> SELECT UNIX_TIMESTAMP('2005-03-27 02:00:00');
+---------------------------------------+
| UNIX_TIMESTAMP('2005-03-27 02:00:00') |
+---------------------------------------+
|                            1111885200 |
+---------------------------------------+
mysql> SELECT FROM_UNIXTIME(1111885200);
+---------------------------+
| FROM_UNIXTIME(1111885200) |
+---------------------------+
| 2005-03-27 03:00:00       |
+---------------------------+
Note: To use named time zones such as 'MET' or 'Europe/Amsterdam', the time zone tables must be properly set up. For instructions, see Section 5.1.15, “MySQL Server Time Zone Support”.
If you want to subtract UNIX_TIMESTAMP() columns, you might want to cast them to signed integers. See Section 12.11, “Cast Functions and Operators”.

UTC_DATE, UTC_DATE()
Returns the current UTC date as a value in
'YYYY-MM-DD' or YYYYMMDD format, depending on whether the function is used in string or numeric context.

mysql> SELECT UTC_DATE(), UTC_DATE() + 0;
        -> '2003-08-14', 20030814

UTC_TIME, UTC_TIME([fsp])

Returns the current UTC time as a value in 'hh:mm:ss' or hhmmss format, depending on whether the function is used in string or numeric context.

If the fsp argument is given to specify a fractional seconds precision from 0 to 6, the return value includes a fractional seconds part of that many digits.

mysql> SELECT UTC_TIME(), UTC_TIME() + 0;
        -> '18:07:53', 180753.000000

UTC_TIMESTAMP, UTC_TIMESTAMP([fsp])

Returns the current UTC date and time as a value in 'YYYY-MM-DD hh:mm:ss' or YYYYMMDDhhmmss format, depending on whether the function is used in string or numeric context.

If the fsp argument is given to specify a fractional seconds precision from 0 to 6, the return value includes a fractional seconds part of that many digits.

mysql> SELECT UTC_TIMESTAMP(), UTC_TIMESTAMP() + 0;
        -> '2003-08-14 18:08:04', 20030814180804.000000

WEEK(date[,mode])

This function returns the week number for
date. The two-argument form of WEEK() enables you to specify whether the week starts on Sunday or Monday and whether the return value should be in the range from 0 to 53 or from 1 to 53. If the mode argument is omitted, the value of the default_week_format system variable is used. See Section 5.1.8, “Server System Variables”.

The following table describes how the mode argument works.

Mode | First day of week | Range | Week 1 is the first week … |
---|---|---|---|
0 | Sunday | 0-53 | with a Sunday in this year |
1 | Monday | 0-53 | with 4 or more days this year |
2 | Sunday | 1-53 | with a Sunday in this year |
3 | Monday | 1-53 | with 4 or more days this year |
4 | Sunday | 0-53 | with 4 or more days this year |
5 | Monday | 0-53 | with a Monday in this year |
6 | Sunday | 1-53 | with 4 or more days this year |
7 | Monday | 1-53 | with a Monday in this year |

For mode values with a meaning of “with 4 or more days this year,” weeks are numbered according to ISO 8601:1988:

If the week containing January 1 has 4 or more days in the new year, it is week 1.
Otherwise, it is the last week of the previous year, and the next week is week 1.
mysql> SELECT WEEK('2008-02-20');
        -> 7
mysql> SELECT WEEK('2008-02-20',0);
        -> 7
mysql> SELECT WEEK('2008-02-20',1);
        -> 8
mysql> SELECT WEEK('2008-12-31',1);
        -> 53

If a date falls in the last week of the previous year, MySQL returns 0 if you do not use 2, 3, 6, or 7 as the optional mode argument:

mysql> SELECT YEAR('2000-01-01'), WEEK('2000-01-01',0);
        -> 2000, 0

One might argue that WEEK() should return 52 because the given date actually occurs in the 52nd week of 1999. WEEK() returns 0 instead so that the return value is “the week number in the given year.” This makes use of the WEEK() function reliable when combined with other functions that extract a date part from a date.

If you prefer a result evaluated with respect to the year that contains the first day of the week for the given date, use 0, 2, 5, or 7 as the optional mode argument.

mysql> SELECT WEEK('2000-01-01',2);
        -> 52

Alternatively, use the YEARWEEK() function:

mysql> SELECT YEARWEEK('2000-01-01');
        -> 199952
mysql> SELECT MID(YEARWEEK('2000-01-01'),5,2);
        -> '52'

WEEKDAY(date)

Returns the weekday index for
date (0 = Monday, 1 = Tuesday, … 6 = Sunday).

mysql> SELECT WEEKDAY('2008-02-03 22:23:00');
        -> 6
mysql> SELECT WEEKDAY('2007-11-06');
        -> 1

WEEKOFYEAR(date)

Returns the calendar week of the date as a number in the range from 1 to 53. WEEKOFYEAR() is a compatibility function that is equivalent to WEEK(date,3).

mysql> SELECT WEEKOFYEAR('2008-02-20');
        -> 8

YEAR(date)

Returns the year for date, in the range 1000 to 9999, or 0 for the “zero” date.

mysql> SELECT YEAR('1987-01-01');
        -> 1987

YEARWEEK(date), YEARWEEK(date,mode)

Returns year and week for a date. The year in the result may be different from the year in the date argument for the first and the last week of the year.

The mode argument works exactly like the mode argument to WEEK(). For the single-argument syntax, a mode value of 0 is used. Unlike WEEK(), the value of default_week_format does not influence YEARWEEK().

mysql> SELECT YEARWEEK('1987-01-01');
        -> 198652

The week number is different from what the WEEK() function would return (0) for optional arguments 0 or 1, as WEEK() then returns the week in the context of the given year.
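
The mode rules for WEEK() can be easier to follow when written out as code. The following Python sketch is not MySQL code and is not how the server implements WEEK(); it only mimics the mode table above under the assumption that the table fully determines the result. The names MODES, week_start, week1_start, and week are hypothetical helpers introduced for illustration.

```python
from datetime import date, timedelta

# Per mode: (week starts on Monday?, "4 or more days" rule?, range is 1-53?).
# Transcribed from the mode table in the WEEK() description.
MODES = {
    0: (False, False, False),
    1: (True, True, False),
    2: (False, False, True),
    3: (True, True, True),
    4: (False, True, False),
    5: (True, False, False),
    6: (False, True, True),
    7: (True, False, True),
}

def week_start(d, monday_first):
    """Return the first day of the week that contains d."""
    # date.weekday() numbers Monday as 0 and Sunday as 6.
    offset = d.weekday() if monday_first else (d.weekday() + 1) % 7
    return d - timedelta(days=offset)

def week1_start(year, monday_first, four_day_rule):
    """Return the date on which week 1 of the given year begins."""
    if four_day_rule:
        # The week containing January 4 is the first week with
        # 4 or more days in the year.
        return week_start(date(year, 1, 4), monday_first)
    # Otherwise week 1 begins on the first Sunday (or Monday) of the
    # year, which always falls on or before January 7.
    return week_start(date(year, 1, 7), monday_first)

def week(d, mode=0):
    """Mimic WEEK(d, mode) for a datetime.date value."""
    monday_first, four_day_rule, one_based = MODES[mode]
    start = week1_start(d.year, monday_first, four_day_rule)
    if d < start:
        if not one_based:
            return 0  # 0-53 modes report "before week 1" as week 0
        # 1-53 modes count the date in the last week of the previous year.
        start = week1_start(d.year - 1, monday_first, four_day_rule)
    elif one_based and d >= week1_start(d.year + 1, monday_first, four_day_rule):
        return 1  # late-December date already in week 1 of the next year
    return (d - start).days // 7 + 1

print(week(date(2008, 2, 20)), week(date(2008, 2, 20), 1))
print(week(date(2000, 1, 1), 0), week(date(2000, 1, 1), 2))
```

Separating "where does week 1 start" from "which range is reported" keeps the eight modes down to three independent switches, which is the structure the table itself suggests.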
Table 12.12 String Functions and Operators
Name | Description |
---|---|
ASCII() | Return numeric value of left-most character |
BIN() | Return a string containing binary representation of a number |
BIT_LENGTH() | Return length of argument in bits |
CHAR() | Return the character for each integer passed |
CHAR_LENGTH() | Return number of characters in argument |
CHARACTER_LENGTH() | Synonym for CHAR_LENGTH() |
CONCAT() | Return concatenated string |
CONCAT_WS() | Return concatenated string with separator |
ELT() | Return string at index number |
EXPORT_SET() | Return a string such that for every bit set in the value bits, you get an on string and for every unset bit, you get an off string |
FIELD() | Index (position) of first argument in subsequent arguments |
FIND_IN_SET() | Index (position) of first argument within second argument |
FORMAT() | Return a number formatted to specified number of decimal places |
FROM_BASE64() | Decode base64 encoded string and return result |
HEX() | Hexadecimal representation of decimal or string value |
INSERT() | Insert substring at specified position up to specified number of characters |
INSTR() | Return the index of the first occurrence of substring |
LCASE() | Synonym for LOWER() |
LEFT() | Return the leftmost number of characters as specified |
LENGTH() | Return the length of a string in bytes |
LIKE | Simple pattern matching |
LOAD_FILE() | Load the named file |
LOCATE() | Return the position of the first occurrence of substring |
LOWER() | Return the argument in lowercase |
LPAD() | Return the string argument, left-padded with the specified string |
LTRIM() | Remove leading spaces |
MAKE_SET() | Return a set of comma-separated strings that have the corresponding bit in bits set |
MATCH | Perform full-text search |
MID() | Return a substring starting from the specified position |
NOT LIKE | Negation of simple pattern matching |
NOT REGEXP | Negation of REGEXP |
OCT() | Return a string containing octal representation of a number |
OCTET_LENGTH() | Synonym for LENGTH() |
ORD() | Return character code for leftmost character of the argument |
POSITION() | Synonym for LOCATE() |
QUOTE() | Escape the argument for use in an SQL statement |
REGEXP | Whether string matches regular expression |
REGEXP_INSTR() | Starting index of substring matching regular expression |
REGEXP_LIKE() | Whether string matches regular expression |
REGEXP_REPLACE() | Replace substrings matching regular expression |
REGEXP_SUBSTR() | Return substring matching regular expression |
REPEAT() | Repeat a string the specified number of times |
REPLACE() | Replace occurrences of a specified string |
REVERSE() | Reverse the characters in a string |
RIGHT() | Return the specified rightmost number of characters |
RLIKE | Whether string matches regular expression |
RPAD() | Return the string argument, right-padded with the specified string |
RTRIM() | Remove trailing spaces |
SOUNDEX() | Return a soundex string |
SOUNDS LIKE | Compare sounds |
SPACE() | Return a string of the specified number of spaces |
STRCMP() | Compare two strings |
SUBSTR() | Return the substring as specified |
SUBSTRING() | Return the substring as specified |
SUBSTRING_INDEX() | Return a substring from a string before the specified number of occurrences of the delimiter |
TO_BASE64() | Return the argument converted to a base-64 string |
TRIM() | Remove leading and trailing spaces |
UCASE() | Synonym for UPPER() |
UNHEX() | Return a string containing the bytes represented by pairs of hexadecimal digits |
UPPER() | Convert to uppercase |
WEIGHT_STRING() | Return the weight string for a string |

String-valued functions return NULL if the length of the result would be greater than the value of the max_allowed_packet system variable. See Section 5.1.1, “Configuring the Server”.
For functions that operate on string positions, the first position is numbered 1.
For functions that take length arguments, noninteger arguments are rounded to the nearest integer.
ASCII(str)

Returns the numeric value of the leftmost character of the string str. Returns 0 if str is the empty string. Returns NULL if str is NULL. ASCII() works for 8-bit characters.

mysql> SELECT ASCII('2');
        -> 50
mysql> SELECT ASCII(2);
        -> 50
mysql> SELECT ASCII('dx');
        -> 100

See also the ORD() function.

BIN(N)

Returns a string representation of the binary value of N, where N is a longlong (BIGINT) number. This is equivalent to CONV(N,10,2). Returns NULL if N is NULL.

mysql> SELECT BIN(12);
        -> '1100'

BIT_LENGTH(str)

Returns the length of the string str in bits.

mysql> SELECT BIT_LENGTH('text');
        -> 32

CHAR(N,... [USING charset_name])

CHAR() interprets each argument N as an integer and returns a string consisting of the characters given by the code values of those integers. NULL values are skipped.

mysql> SELECT CHAR(77,121,83,81,'76');
        -> 'MySQL'
mysql> SELECT CHAR(77,77.3,'77.3');
        -> 'MMM'

CHAR() arguments larger than 255 are converted into multiple result bytes. For example, CHAR(256) is equivalent to CHAR(1,0), and CHAR(256*256) is equivalent to CHAR(1,0,0):

mysql> SELECT HEX(CHAR(1,0)), HEX(CHAR(256));
+----------------+----------------+
| HEX(CHAR(1,0)) | HEX(CHAR(256)) |
+----------------+----------------+
| 0100           | 0100           |
+----------------+----------------+
mysql> SELECT HEX(CHAR(1,0,0)), HEX(CHAR(256*256));
+------------------+--------------------+
| HEX(CHAR(1,0,0)) | HEX(CHAR(256*256)) |
+------------------+--------------------+
| 010000           | 010000             |
+------------------+--------------------+

By default, CHAR() returns a binary string. To produce a string in a given character set, use the optional USING clause:

mysql> SELECT CHARSET(CHAR(X'65')), CHARSET(CHAR(X'65' USING utf8));
+----------------------+---------------------------------+
| CHARSET(CHAR(X'65')) | CHARSET(CHAR(X'65' USING utf8)) |
+----------------------+---------------------------------+
| binary               | utf8                            |
+----------------------+---------------------------------+

If USING is given and the result string is illegal for the given character set, a warning is issued. Also, if strict SQL mode is enabled, the result from CHAR() becomes NULL.

CHAR_LENGTH(str)

Returns the length of the string str, measured in characters. A multibyte character counts as a single character. This means that for a string containing five 2-byte characters, LENGTH() returns 10, whereas CHAR_LENGTH() returns 5.

CHARACTER_LENGTH(str)

CHARACTER_LENGTH() is a synonym for CHAR_LENGTH().

CONCAT(str1,str2,...)

Returns the string that results from concatenating the arguments. May have one or more arguments. If all arguments are nonbinary strings, the result is a nonbinary string. If the arguments include any binary strings, the result is a binary string. A numeric argument is converted to its equivalent nonbinary string form.

CONCAT() returns NULL if any argument is NULL.

mysql> SELECT CONCAT('My', 'S', 'QL');
        -> 'MySQL'
mysql> SELECT CONCAT('My', NULL, 'QL');
        -> NULL
mysql> SELECT CONCAT(14.3);
        -> '14.3'

For quoted strings, concatenation can be performed by placing the strings next to each other:

mysql> SELECT 'My' 'S' 'QL';
        -> 'MySQL'

CONCAT_WS(separator,str1,str2,...)

CONCAT_WS() stands for Concatenate With Separator and is a special form of CONCAT(). The first argument is the separator for the rest of the arguments. The separator is added between the strings to be concatenated. The separator can be a string, as can the rest of the arguments. If the separator is NULL, the result is NULL.

mysql> SELECT CONCAT_WS(',','First name','Second name','Last Name');
        -> 'First name,Second name,Last Name'
mysql> SELECT CONCAT_WS(',','First name',NULL,'Last Name');
        -> 'First name,Last Name'

CONCAT_WS() does not skip empty strings. However, it does skip any NULL values after the separator argument.

ELT(N,str1,str2,str3,...)

ELT() returns the Nth element of the list of strings: str1 if N = 1, str2 if N = 2, and so on. Returns NULL if N is less than 1 or greater than the number of arguments. ELT() is the complement of FIELD().

mysql> SELECT ELT(1, 'Aa', 'Bb', 'Cc', 'Dd');
        -> 'Aa'
mysql> SELECT ELT(4, 'Aa', 'Bb', 'Cc', 'Dd');
        -> 'Dd'

EXPORT_SET(bits,on,off[,separator[,number_of_bits]])

Returns a string such that for every bit set in the value bits, you get an on string and for every bit not set in the value, you get an off string. Bits in bits are examined from right to left (from low-order to high-order bits). Strings are added to the result from left to right, separated by the separator string (the default being the comma character ,). The number of bits examined is given by number_of_bits, which has a default of 64 if not specified. number_of_bits is silently clipped to 64 if larger than 64. It is treated as an unsigned integer, so a value of −1 is effectively the same as 64.

mysql> SELECT EXPORT_SET(5,'Y','N',',',4);
        -> 'Y,N,Y,N'
mysql> SELECT EXPORT_SET(6,'1','0',',',10);
        -> '0,1,1,0,0,0,0,0,0,0'

FIELD(str,str1,str2,str3,...)

Returns the index (position) of
str in the str1, str2, str3, ... list. Returns 0 if str is not found.

If all arguments to FIELD() are strings, all arguments are compared as strings. If all arguments are numbers, they are compared as numbers. Otherwise, the arguments are compared as double.

If str is NULL, the return value is 0 because NULL fails equality comparison with any value. FIELD() is the complement of ELT().

mysql> SELECT FIELD('Bb', 'Aa', 'Bb', 'Cc', 'Dd', 'Ff');
        -> 2
mysql> SELECT FIELD('Gg', 'Aa', 'Bb', 'Cc', 'Dd', 'Ff');
        -> 0

FIND_IN_SET(str,strlist)

Returns a value in the range of 1 to N if the string str is in the string list strlist consisting of N substrings. A string list is a string composed of substrings separated by , characters. If the first argument is a constant string and the second is a column of type SET, the FIND_IN_SET() function is optimized to use bit arithmetic. Returns 0 if str is not in strlist or if strlist is the empty string. Returns NULL if either argument is NULL. This function does not work properly if the first argument contains a comma (,) character.

mysql> SELECT FIND_IN_SET('b','a,b,c,d');
        -> 2

FORMAT(X,D[,locale])

Formats the number
X to a format like '#,###,###.##', rounded to D decimal places, and returns the result as a string. If D is 0, the result has no decimal point or fractional part.

The optional third parameter specifies the locale that determines the decimal point, thousands separator, and grouping between separators in the result. Permissible locale values are the same as the legal values for the lc_time_names system variable (see Section 10.16, “MySQL Server Locale Support”). If no locale is specified, the default is 'en_US'.

mysql> SELECT FORMAT(12332.123456, 4);
        -> '12,332.1235'
mysql> SELECT FORMAT(12332.1,4);
        -> '12,332.1000'
mysql> SELECT FORMAT(12332.2,0);
        -> '12,332'
mysql> SELECT FORMAT(12332.2,2,'de_DE');
        -> '12.332,20'

FROM_BASE64(str)

Takes a string encoded with the base-64 encoding rules used by TO_BASE64() and returns the decoded result as a binary string. The result is NULL if the argument is NULL or not a valid base-64 string. See the description of TO_BASE64() for details about the encoding and decoding rules.

mysql> SELECT TO_BASE64('abc'), FROM_BASE64(TO_BASE64('abc'));
        -> 'YWJj', 'abc'

HEX(str), HEX(N)

For a string argument
str, HEX() returns a hexadecimal string representation of str where each byte of each character in str is converted to two hexadecimal digits. (Multibyte characters therefore become more than two digits.) The inverse of this operation is performed by the UNHEX() function.

For a numeric argument N, HEX() returns a hexadecimal string representation of the value of N treated as a longlong (BIGINT) number. This is equivalent to CONV(N,10,16). The inverse of this operation is performed by CONV(HEX(N),16,10).

mysql> SELECT X'616263', HEX('abc'), UNHEX(HEX('abc'));
        -> 'abc', 616263, 'abc'
mysql> SELECT HEX(255), CONV(HEX(255),16,10);
        -> 'FF', 255

INSERT(str,pos,len,newstr)

Returns the string str, with the substring beginning at position pos and len characters long replaced by the string newstr. Returns the original string if pos is not within the length of the string. Replaces the rest of the string from position pos if len is not within the length of the rest of the string. Returns NULL if any argument is NULL.

mysql> SELECT INSERT('Quadratic', 3, 4, 'What');
        -> 'QuWhattic'
mysql> SELECT INSERT('Quadratic', -1, 4, 'What');
        -> 'Quadratic'
mysql> SELECT INSERT('Quadratic', 3, 100, 'What');
        -> 'QuWhat'

This function is multibyte safe.
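
The two boundary rules for INSERT() (a pos outside the string, and a len that runs past the end) are easy to mix up. The following Python sketch is not MySQL code; it mimics the documented wording, with None standing in for NULL. The function name sql_insert is hypothetical, and treating a negative len as "not within the length of the rest of the string" is an assumption drawn from that wording.

```python
def sql_insert(s, pos, length, newstr):
    """Mimic INSERT(str, pos, len, newstr) using 1-based positions."""
    if s is None or pos is None or length is None or newstr is None:
        return None  # any NULL argument gives NULL
    if pos < 1 or pos > len(s):
        return s  # pos is not within the length of the string
    rest = len(s) - pos + 1  # characters from pos to the end
    if length < 0 or length > rest:
        length = rest  # assumption: out-of-range len replaces the rest
    return s[:pos - 1] + newstr + s[pos - 1 + length:]

print(sql_insert("Quadratic", 3, 4, "What"))
print(sql_insert("Quadratic", -1, 4, "What"))
print(sql_insert("Quadratic", 3, 100, "What"))
```

The three calls correspond to the three examples shown for INSERT().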
INSTR(str,substr)

Returns the position of the first occurrence of substring substr in string str. This is the same as the two-argument form of LOCATE(), except that the order of the arguments is reversed.

mysql> SELECT INSTR('foobarbar', 'bar');
        -> 4
mysql> SELECT INSTR('xbar', 'foobar');
        -> 0

This function is multibyte safe, and is case-sensitive only if at least one argument is a binary string.

LCASE(str)

LCASE() is a synonym for LOWER().

LCASE() used in a view is rewritten as LOWER() when storing the view's definition. (Bug #12844279)

LEFT(str,len)

Returns the leftmost len characters from the string str, or NULL if any argument is NULL.

mysql> SELECT LEFT('foobarbar', 5);
        -> 'fooba'

This function is multibyte safe.
LENGTH(str)

Returns the length of the string str, measured in bytes. A multibyte character counts as multiple bytes. This means that for a string containing five 2-byte characters, LENGTH() returns 10, whereas CHAR_LENGTH() returns 5.

mysql> SELECT LENGTH('text');
        -> 4

Note

The Length() OpenGIS spatial function is named ST_Length() in MySQL.

LOAD_FILE(file_name)

Reads the file and returns the file contents as a string. To use this function, the file must be located on the server host, you must specify the full path name to the file, and you must have the FILE privilege. The file must be readable by the server and its size less than max_allowed_packet bytes. If the secure_file_priv system variable is set to a nonempty directory name, the file to be loaded must be located in that directory. (Prior to MySQL 8.0.17, the file must be readable by all, not just readable by the server.)

If the file does not exist or cannot be read because one of the preceding conditions is not satisfied, the function returns NULL.

The character_set_filesystem system variable controls interpretation of file names that are given as literal strings.

mysql> UPDATE t
            SET blob_col=LOAD_FILE('/tmp/picture')
            WHERE id=1;

LOCATE(substr,str), LOCATE(substr,str,pos)

The first syntax returns the position of the first occurrence of substring substr in string str. The second syntax returns the position of the first occurrence of substring substr in string str, starting at position pos. Returns 0 if substr is not in str. Returns NULL if any argument is NULL.

mysql> SELECT LOCATE('bar', 'foobarbar');
        -> 4
mysql> SELECT LOCATE('xbar', 'foobar');
        -> 0
mysql> SELECT LOCATE('bar', 'foobarbar', 5);
        -> 7

This function is multibyte safe, and is case-sensitive only if at least one argument is a binary string.
LOWER(str)

Returns the string str with all characters changed to lowercase according to the current character set mapping. The default is utf8mb4.

mysql> SELECT LOWER('QUADRATICALLY');
        -> 'quadratically'

LOWER() (and UPPER()) are ineffective when applied to binary strings (BINARY, VARBINARY, BLOB). To perform lettercase conversion of a binary string, first convert it to a nonbinary string using a character set appropriate for the data stored in the string:

mysql> SET @str = BINARY 'New York';
mysql> SELECT LOWER(@str), LOWER(CONVERT(@str USING utf8mb4));
+-------------+------------------------------------+
| LOWER(@str) | LOWER(CONVERT(@str USING utf8mb4)) |
+-------------+------------------------------------+
| New York    | new york                           |
+-------------+------------------------------------+

For collations of Unicode character sets, LOWER() and UPPER() work according to the Unicode Collation Algorithm (UCA) version in the collation name, if there is one, and UCA 4.0.0 if no version is specified. For example, utf8mb4_0900_ai_ci and utf8_unicode_520_ci work according to UCA 9.0.0 and 5.2.0, respectively, whereas utf8_unicode_ci works according to UCA 4.0.0. See Section 10.10.1, “Unicode Character Sets”.

This function is multibyte safe.

LCASE() used within views is rewritten as LOWER().

LPAD(str,len,padstr)

Returns the string str, left-padded with the string padstr to a length of len characters. If str is longer than len, the return value is shortened to len characters.

mysql> SELECT LPAD('hi',4,'??');
        -> '??hi'
mysql> SELECT LPAD('hi',1,'??');
        -> 'h'

LTRIM(str)

Returns the string str with leading space characters removed.

mysql> SELECT LTRIM('  barbar');
        -> 'barbar'

This function is multibyte safe.
MAKE_SET(bits,str1,str2,...)

Returns a set value (a string containing substrings separated by , characters) consisting of the strings that have the corresponding bit in bits set. str1 corresponds to bit 0, str2 to bit 1, and so on. NULL values in str1, str2, ... are not appended to the result.

mysql> SELECT MAKE_SET(1,'a','b','c');
        -> 'a'
mysql> SELECT MAKE_SET(1 | 4,'hello','nice','world');
        -> 'hello,world'
mysql> SELECT MAKE_SET(1 | 4,'hello','nice',NULL,'world');
        -> 'hello'
mysql> SELECT MAKE_SET(0,'a','b','c');
        -> ''

MID(str,pos,len)

MID(str,pos,len) is a synonym for SUBSTRING(str,pos,len).

OCT(N)

Returns a string representation of the octal value of N, where N is a longlong (BIGINT) number. This is equivalent to CONV(N,10,8). Returns NULL if N is NULL.

mysql> SELECT OCT(12);
        -> '14'

OCTET_LENGTH(str)

OCTET_LENGTH() is a synonym for LENGTH().

ORD(str)

If the leftmost character of the string str is a multibyte character, returns the code for that character, calculated from the numeric values of its constituent bytes using this formula:

  (1st byte code) + (2nd byte code * 256) + (3rd byte code * 256^2) ...

If the leftmost character is not a multibyte character, ORD() returns the same value as the ASCII() function.

mysql> SELECT ORD('2');
        -> 50

POSITION(substr IN str)

POSITION(substr IN str) is a synonym for LOCATE(substr,str).

QUOTE(str)

Quotes a string to produce a result that can be used as a properly escaped data value in an SQL statement. The string is returned enclosed by single quotation marks and with each instance of backslash (\), single quote ('), ASCII NUL, and Control+Z preceded by a backslash. If the argument is NULL, the return value is the word “NULL” without enclosing single quotation marks.

mysql> SELECT QUOTE('Don\'t!');
        -> 'Don\'t!'
mysql> SELECT QUOTE(NULL);
        -> NULL

For comparison, see the quoting rules for literal strings and within the C API in Section 9.1.1, “String Literals”, and mysql_real_escape_string_quote().
REPEAT(str,count)

Returns a string consisting of the string str repeated count times. If count is less than 1, returns an empty string. Returns NULL if str or count is NULL.

mysql> SELECT REPEAT('MySQL', 3);
        -> 'MySQLMySQLMySQL'

REPLACE(str,from_str,to_str)

Returns the string str with all occurrences of the string from_str replaced by the string to_str. REPLACE() performs a case-sensitive match when searching for from_str.

mysql> SELECT REPLACE('www.mysql.com', 'w', 'Ww');
        -> 'WwWwWw.mysql.com'

This function is multibyte safe.

REVERSE(str)

Returns the string str with the order of the characters reversed.

mysql> SELECT REVERSE('abc');
        -> 'cba'

This function is multibyte safe.

RIGHT(str,len)

Returns the rightmost len characters from the string str, or NULL if any argument is NULL.

mysql> SELECT RIGHT('foobarbar', 4);
        -> 'rbar'

This function is multibyte safe.

RPAD(str,len,padstr)

Returns the string str, right-padded with the string padstr to a length of len characters. If str is longer than len, the return value is shortened to len characters.

mysql> SELECT RPAD('hi',5,'?');
        -> 'hi???'
mysql> SELECT RPAD('hi',1,'?');
        -> 'h'

This function is multibyte safe.
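
LPAD() and RPAD() both pad and truncate, which is easy to overlook. The following Python sketch is not MySQL code; it mimics the documented rule that padstr is repeated as often as needed and that a str longer than len is shortened to len characters. The names lpad, rpad, and _pad are hypothetical, and the sketch assumes a non-negative len and a nonempty padstr, cases the description above does not cover.

```python
def _pad(need, padstr):
    """Return exactly `need` characters built by repeating padstr."""
    repeats = need // len(padstr) + 1
    return (padstr * repeats)[:need]

def _check(length, padstr):
    # Assumption: these inputs are outside what the sketch models.
    if length < 0 or not padstr:
        raise ValueError("sketch assumes len >= 0 and a nonempty padstr")

def lpad(s, length, padstr):
    """Mimic LPAD(str, len, padstr)."""
    _check(length, padstr)
    if len(s) >= length:
        return s[:length]  # shortened to len characters
    return _pad(length - len(s), padstr) + s

def rpad(s, length, padstr):
    """Mimic RPAD(str, len, padstr)."""
    _check(length, padstr)
    if len(s) >= length:
        return s[:length]
    return s + _pad(length - len(s), padstr)

print(lpad("hi", 4, "??"), lpad("hi", 1, "??"))
print(rpad("hi", 5, "?"), rpad("hi", 1, "?"))
```

Note that truncation always keeps the leftmost characters of str, for LPAD() as well as RPAD(), as the 'h' results in the examples show.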
RTRIM(str)

Returns the string str with trailing space characters removed.

mysql> SELECT RTRIM('barbar   ');
        -> 'barbar'

This function is multibyte safe.
SOUNDEX(str)

Returns a soundex string from str. Two strings that sound almost the same should have identical soundex strings. A standard soundex string is four characters long, but the SOUNDEX() function returns an arbitrarily long string. You can use SUBSTRING() on the result to get a standard soundex string. All nonalphabetic characters in str are ignored. All international alphabetic characters outside the A-Z range are treated as vowels.

Important

When using SOUNDEX(), you should be aware of the following limitations:

This function, as currently implemented, is intended to work well with strings that are in the English language only. Strings in other languages may not produce reliable results.
This function is not guaranteed to provide consistent results with strings that use multibyte character sets, including utf-8. See Bug #22638 for more information.

mysql> SELECT SOUNDEX('Hello');
        -> 'H400'
mysql> SELECT SOUNDEX('Quadratically');
        -> 'Q36324'

Note

This function implements the original Soundex algorithm, not the more popular enhanced version (also described by D. Knuth). The difference is that the original version discards vowels first and duplicates second, whereas the enhanced version discards duplicates first and vowels second.
expr1 SOUNDS LIKE expr2

This is the same as SOUNDEX(expr1) = SOUNDEX(expr2).

SPACE(N)

Returns a string consisting of N space characters.

mysql> SELECT SPACE(6);
        -> '      '

SUBSTR(str,pos), SUBSTR(str FROM pos), SUBSTR(str,pos,len), SUBSTR(str FROM pos FOR len)

SUBSTR() is a synonym for SUBSTRING().

SUBSTRING(str,pos), SUBSTRING(str FROM pos), SUBSTRING(str,pos,len), SUBSTRING(str FROM pos FOR len)

The forms without a len argument return a substring from string str starting at position pos. The forms with a len argument return a substring len characters long from string str, starting at position pos. The forms that use FROM are standard SQL syntax. It is also possible to use a negative value for pos. In this case, the beginning of the substring is pos characters from the end of the string, rather than the beginning. A negative value may be used for pos in any of the forms of this function. A value of 0 for pos returns an empty string.

For all forms of SUBSTRING(), the position of the first character in the string from which the substring is to be extracted is reckoned as 1.

mysql> SELECT SUBSTRING('Quadratically',5);
        -> 'ratically'
mysql> SELECT SUBSTRING('foobarbar' FROM 4);
        -> 'barbar'
mysql> SELECT SUBSTRING('Quadratically',5,6);
        -> 'ratica'
mysql> SELECT SUBSTRING('Sakila', -3);
        -> 'ila'
mysql> SELECT SUBSTRING('Sakila', -5, 3);
        -> 'aki'
mysql> SELECT SUBSTRING('Sakila' FROM -4 FOR 2);
        -> 'ki'

This function is multibyte safe.

If len is less than 1, the result is the empty string.

SUBSTRING_INDEX(str,delim,count)

Returns the substring from string str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. SUBSTRING_INDEX() performs a case-sensitive match when searching for delim.

mysql> SELECT SUBSTRING_INDEX('www.mysql.com', '.', 2);
        -> 'www.mysql'
mysql> SELECT SUBSTRING_INDEX('www.mysql.com', '.', -2);
        -> 'mysql.com'

This function is multibyte safe.
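
The position rules for SUBSTRING() and the counting rule for SUBSTRING_INDEX() can be restated compactly in code. The following Python sketch is not MySQL code; it mimics the descriptions above. The names substring and substring_index are hypothetical. Three behaviors are assumptions not stated above: a negative pos that reaches past the start of the string yields an empty string, and a count of 0 or an empty delim yields an empty string.

```python
def substring(s, pos, length=None):
    """Mimic SUBSTRING(str, pos[, len]) with 1-based positions."""
    if pos == 0 or (length is not None and length < 1):
        return ""  # pos 0, or len less than 1, gives the empty string
    if pos < 0:
        start = len(s) + pos  # count back from the end of the string
        if start < 0:
            return ""  # assumption: pos reaches past the start
    else:
        start = pos - 1  # convert 1-based position to 0-based index
    end = len(s) if length is None else start + length
    return s[start:end]

def substring_index(s, delim, count):
    """Mimic SUBSTRING_INDEX(str, delim, count)."""
    if count == 0 or delim == "":
        return ""  # assumption: not covered by the description
    parts = s.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter from the left.
        return delim.join(parts[:count])
    # Everything to the right of the count-th delimiter from the right.
    return delim.join(parts[count:])

print(substring("Sakila", -5, 3))
print(substring_index("www.mysql.com", ".", 2))
print(substring_index("www.mysql.com", ".", -2))
```

Nesting the two directions of SUBSTRING_INDEX() is a common way to pick out a single field: in terms of this sketch, substring_index(substring_index("a,b,c", ",", 2), ",", -1) keeps the first two fields and then the last of those.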
TO_BASE64(str)

Converts the string argument to base-64 encoded form and returns the result as a character string with the connection character set and collation. If the argument is not a string, it is converted to a string before conversion takes place. The result is NULL if the argument is NULL. Base-64 encoded strings can be decoded using the FROM_BASE64() function.

mysql> SELECT TO_BASE64('abc'), FROM_BASE64(TO_BASE64('abc'));
        -> 'YWJj', 'abc'

Different base-64 encoding schemes exist. These are the encoding and decoding rules used by TO_BASE64() and FROM_BASE64():

The encoding for alphabet value 62 is '+'.
The encoding for alphabet value 63 is '/'.
Encoded output consists of groups of 4 printable characters. Each 3 bytes of the input data are encoded using 4 characters. If the last group is incomplete, it is padded with '=' characters to a length of 4.
A newline is added after each 76 characters of encoded output to divide long output into multiple lines.
Decoding recognizes and ignores newline, carriage return, tab, and space.
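
The rules above can be mimicked with Python's standard base64 module, which uses the same '+' and '/' alphabet and '=' padding. This sketch is not MySQL code; the names to_base64 and from_base64 are hypothetical Python helpers, None stands in for NULL, and the choice not to emit a newline after the final line of output is an assumption.

```python
import base64
import binascii

def to_base64(data: bytes) -> str:
    """Encode bytes, breaking the output into lines of at most 76 characters."""
    encoded = base64.b64encode(data).decode("ascii")
    lines = [encoded[i:i + 76] for i in range(0, len(encoded), 76)]
    return "\n".join(lines)

def from_base64(text):
    """Decode base-64 text, ignoring newline, carriage return, tab, and space.

    Returns None for None or invalid input, mirroring the documented
    NULL result.
    """
    if text is None:
        return None
    compact = "".join(ch for ch in text if ch not in "\n\r\t ")
    try:
        return base64.b64decode(compact, validate=True)
    except (binascii.Error, ValueError):
        return None

print(to_base64(b"abc"))
print(from_base64(to_base64(b"abc")))
```

Each 3 input bytes become 4 output characters, so 100 bytes encode to 136 characters, which this sketch splits into one line of 76 and one of 60.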

TRIM([{BOTH | LEADING | TRAILING} [remstr] FROM] str), TRIM([remstr FROM] str)

Returns the string str with all remstr prefixes or suffixes removed. If none of the specifiers BOTH, LEADING, or TRAILING is given, BOTH is assumed. remstr is optional and, if not specified, spaces are removed.

mysql> SELECT TRIM('  bar   ');
        -> 'bar'
mysql> SELECT TRIM(LEADING 'x' FROM 'xxxbarxxx');
        -> 'barxxx'
mysql> SELECT TRIM(BOTH 'x' FROM 'xxxbarxxx');
        -> 'bar'
mysql> SELECT TRIM(TRAILING 'xyz' FROM 'barxxyz');
        -> 'barx'

This function is multibyte safe.
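
TRIM() removes whole occurrences of remstr, not individual characters from a set, which is why the last example leaves 'barx'. The following Python sketch is not MySQL code; it mimics the description above and contrasts it with Python's own str.rstrip(), which does treat its argument as a character set. The function name trim and the where parameter are hypothetical, and returning the input unchanged for an empty remstr is an assumption.

```python
def trim(s, remstr=" ", where="BOTH"):
    """Mimic TRIM([{BOTH | LEADING | TRAILING} [remstr] FROM] str)."""
    if remstr == "":
        return s  # assumption: nothing to remove
    if where in ("BOTH", "LEADING"):
        while s.startswith(remstr):
            s = s[len(remstr):]  # drop one whole remstr prefix
    if where in ("BOTH", "TRAILING"):
        while s.endswith(remstr):
            s = s[:-len(remstr)]  # drop one whole remstr suffix
    return s

print(trim("barxxyz", "xyz", "TRAILING"))  # whole-string removal
print("barxxyz".rstrip("xyz"))             # character-set removal
```

The first call removes the single 'xyz' suffix and stops, while rstrip() keeps stripping any trailing x, y, or z.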

UCASE(str)

UCASE() is a synonym for UPPER().

UCASE() used within views is rewritten as UPPER().

UNHEX(str)

For a string argument str, UNHEX(str) interprets each pair of characters in the argument as a hexadecimal number and converts it to the byte represented by the number. The return value is a binary string.

mysql> SELECT UNHEX('4D7953514C');
        -> 'MySQL'
mysql> SELECT X'4D7953514C';
        -> 'MySQL'
mysql> SELECT UNHEX(HEX('string'));
        -> 'string'
mysql> SELECT HEX(UNHEX('1267'));
        -> '1267'

The characters in the argument string must be legal hexadecimal digits: '0' .. '9', 'A' .. 'F', 'a' .. 'f'. If the argument contains any nonhexadecimal digits, the result is NULL:

mysql> SELECT UNHEX('GG');
+-------------+
| UNHEX('GG') |
+-------------+
| NULL        |
+-------------+

A NULL result can occur if the argument to UNHEX() is a BINARY column, because values are padded with 0x00 bytes when stored but those bytes are not stripped on retrieval. For example, '41' is stored into a CHAR(3) column as '41 ' and retrieved as '41' (with the trailing pad space stripped), so UNHEX() for the column value returns X'41'. By contrast, '41' is stored into a BINARY(3) column as '41\0' and retrieved as '41\0' (with the trailing pad 0x00 byte not stripped). '\0' is not a legal hexadecimal digit, so UNHEX() for the column value returns NULL.

For a numeric argument N, the inverse of HEX(N) is not performed by UNHEX(). Use CONV(HEX(N),16,10) instead. See the description of HEX().

UPPER(str)

Returns the string str with all characters changed to uppercase according to the current character set mapping. The default is utf8mb4.

mysql> SELECT UPPER('Hej');
        -> 'HEJ'

See the description of LOWER() for information that also applies to UPPER(). This includes information about how to perform lettercase conversion of binary strings (BINARY, VARBINARY, BLOB) for which these functions are ineffective, and information about case folding for Unicode character sets.

This function is multibyte safe.

UCASE() used within views is rewritten as UPPER().

WEIGHT_STRING(str [AS {CHAR|BINARY}(N)] [flags])

This function returns the weight string for the input string. The return value is a binary string that represents the comparison and sorting value of the string. It has these properties:
If WEIGHT_STRING(str1) = WEIGHT_STRING(str2), then str1 = str2 (str1 and str2 are considered equal)
If WEIGHT_STRING(str1) < WEIGHT_STRING(str2), then str1 < str2 (str1 sorts before str2)

WEIGHT_STRING() is a debugging function intended for internal use. Its behavior can change without notice between MySQL versions. It can be used for testing and debugging of collations, especially if you are adding a new collation. See Section 10.14, “Adding a Collation to a Character Set”.

This list briefly summarizes the arguments. More details are given in the discussion following the list.

str: The input string expression.
AS clause: Optional; cast the input string to a given type and length.
flags: Optional; unused.

The input string, str, is a string expression. If the input is a nonbinary (character) string such as a CHAR, VARCHAR, or TEXT value, the return value contains the collation weights for the string. If the input is a binary (byte) string such as a BINARY, VARBINARY, or BLOB value, the return value is the same as the input (the weight for each byte in a binary string is the byte value). If the input is NULL, WEIGHT_STRING() returns NULL.

Examples:
mysql> SET @s = _utf8mb4 'AB' COLLATE utf8mb4_0900_ai_ci;
mysql> SELECT @s, HEX(@s), HEX(WEIGHT_STRING(@s));
+------+---------+------------------------+
| @s   | HEX(@s) | HEX(WEIGHT_STRING(@s)) |
+------+---------+------------------------+
| AB   | 4142    | 1C471C60               |
+------+---------+------------------------+

mysql> SET @s = _utf8mb4 'ab' COLLATE utf8mb4_0900_ai_ci;
mysql> SELECT @s, HEX(@s), HEX(WEIGHT_STRING(@s));
+------+---------+------------------------+
| @s   | HEX(@s) | HEX(WEIGHT_STRING(@s)) |
+------+---------+------------------------+
| ab   | 6162    | 1C471C60               |
+------+---------+------------------------+

mysql> SET @s = CAST('AB' AS BINARY);
mysql> SELECT @s, HEX(@s), HEX(WEIGHT_STRING(@s));
+------+---------+------------------------+
| @s   | HEX(@s) | HEX(WEIGHT_STRING(@s)) |
+------+---------+------------------------+
| AB   | 4142    | 4142                   |
+------+---------+------------------------+

mysql> SET @s = CAST('ab' AS BINARY);
mysql> SELECT @s, HEX(@s), HEX(WEIGHT_STRING(@s));
+------+---------+------------------------+
| @s   | HEX(@s) | HEX(WEIGHT_STRING(@s)) |
+------+---------+------------------------+
| ab   | 6162    | 6162                   |
+------+---------+------------------------+

The preceding examples use HEX() to display the WEIGHT_STRING() result. Because the result is a binary value, HEX() can be especially useful when the result contains nonprinting values, to display it in printable form:

mysql> SET @s = CONVERT(X'C39F' USING utf8) COLLATE utf8_czech_ci;
mysql> SELECT HEX(WEIGHT_STRING(@s));
+------------------------+
| HEX(WEIGHT_STRING(@s)) |
+------------------------+
| 0FEA0FEA               |
+------------------------+

For non-NULL return values, the data type of the value is VARBINARY if its length is within the maximum length for VARBINARY, otherwise the data type is BLOB.

The AS clause may be given to cast the input string to a nonbinary or binary string and to force it to a given length:

- AS CHAR(N) casts the string to a nonbinary string and pads it on the right with spaces to a length of N characters. N must be at least 1. If N is less than the length of the input string, the string is truncated to N characters. No warning occurs for truncation.
- AS BINARY(N) is similar but casts the string to a binary string, N is measured in bytes (not characters), and padding uses 0x00 bytes (not spaces).

mysql> SET NAMES 'latin1';
mysql> SELECT HEX(WEIGHT_STRING('ab' AS CHAR(4)));
+-------------------------------------+
| HEX(WEIGHT_STRING('ab' AS CHAR(4))) |
+-------------------------------------+
| 41422020                            |
+-------------------------------------+
mysql> SET NAMES 'utf8';
mysql> SELECT HEX(WEIGHT_STRING('ab' AS CHAR(4)));
+-------------------------------------+
| HEX(WEIGHT_STRING('ab' AS CHAR(4))) |
+-------------------------------------+
| 0041004200200020                    |
+-------------------------------------+

mysql> SELECT HEX(WEIGHT_STRING('ab' AS BINARY(4)));
+---------------------------------------+
| HEX(WEIGHT_STRING('ab' AS BINARY(4))) |
+---------------------------------------+
| 61620000                              |
+---------------------------------------+

The flags clause currently is unused.
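
As a rough illustration of the two properties listed at the start of this description, the following sketch compares weight strings directly. The literals 'cote' and 'COTE' and the column aliases are illustrative choices, not values from the manual. Under an accent-insensitive and case-insensitive collation such as utf8mb4_0900_ai_ci, the two columns of each result are expected to agree.

```sql
-- Illustrative values only; any two strings that share a collation work.
SET @a = _utf8mb4 'cote' COLLATE utf8mb4_0900_ai_ci;
SET @b = _utf8mb4 'COTE' COLLATE utf8mb4_0900_ai_ci;

-- Equality of weight strings should agree with equality of the strings.
SELECT WEIGHT_STRING(@a) = WEIGHT_STRING(@b) AS weights_equal,
       @a = @b                               AS strings_equal;

-- Ordering of weight strings should agree with ordering of the strings.
SELECT WEIGHT_STRING(@a) < WEIGHT_STRING(@b) AS weight_sorts_first,
       @a < @b                               AS string_sorts_first;
```

Because WEIGHT_STRING() is intended for debugging, a comparison like this is better suited to checking a collation than to application logic.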
If a string function is given a binary string as an argument, the resulting string is also a binary string. A number converted to a string is treated as a binary string. This affects only comparisons.
Normally, if any expression in a string comparison is case-sensitive, the comparison is performed in case-sensitive fashion.
expr LIKE pat [ESCAPE 'escape_char']

Pattern matching using an SQL pattern. Returns 1 (TRUE) or 0 (FALSE). If either expr or pat is NULL, the result is NULL.

The pattern need not be a literal string. For example, it can be specified as a string expression or table column. In the latter case, the column must be defined as one of the MySQL string types (see Section 11.3, “String Data Types”).

Per the SQL standard, LIKE performs matching on a per-character basis, thus it can produce results different from the = comparison operator:

mysql> SELECT 'ä' LIKE 'ae' COLLATE latin1_german2_ci;
+-----------------------------------------+
| 'ä' LIKE 'ae' COLLATE latin1_german2_ci |
+-----------------------------------------+
|                                       0 |
+-----------------------------------------+
mysql> SELECT 'ä' = 'ae' COLLATE latin1_german2_ci;
+--------------------------------------+
| 'ä' = 'ae' COLLATE latin1_german2_ci |
+--------------------------------------+
|                                    1 |
+--------------------------------------+

In particular, trailing spaces are always significant. This differs from comparisons performed with the = operator, for which the significance of trailing spaces in nonbinary strings (CHAR, VARCHAR, and TEXT values) depends on the pad attribute of the collation used for the comparison. For more information, see Trailing Space Handling in Comparisons.

With LIKE you can use the following two wildcard characters in the pattern:

- % matches any number of characters, even zero characters.
- _ matches exactly one character.
mysql> SELECT 'David!' LIKE 'David_';
        -> 1
mysql> SELECT 'David!' LIKE '%D%v%';
        -> 1

To test for literal instances of a wildcard character, precede it by the escape character. If you do not specify the ESCAPE character, \ is assumed.

- \% matches one % character.
- \_ matches one _ character.
mysql> SELECT 'David!' LIKE 'David\_';
        -> 0
mysql> SELECT 'David_' LIKE 'David\_';
        -> 1

To specify a different escape character, use the ESCAPE clause:

mysql> SELECT 'David_' LIKE 'David|_' ESCAPE '|';
        -> 1

The escape sequence should be empty or one character long. The expression must evaluate as a constant at execution time. If the NO_BACKSLASH_ESCAPES SQL mode is enabled, the sequence cannot be empty.

The following statements illustrate that string comparisons are not case-sensitive unless one of the operands is case-sensitive (uses a case-sensitive collation or is a binary string):
mysql> SELECT 'abc' LIKE 'ABC';
        -> 1
mysql> SELECT 'abc' LIKE _utf8mb4 'ABC' COLLATE utf8mb4_0900_as_cs;
        -> 0
mysql> SELECT 'abc' LIKE _utf8mb4 'ABC' COLLATE utf8mb4_bin;
        -> 0
mysql> SELECT 'abc' LIKE BINARY 'ABC';
        -> 0

As an extension to standard SQL, MySQL permits LIKE on numeric expressions.

mysql> SELECT 10 LIKE '1%';
        -> 1

Note

Because MySQL uses C escape syntax in strings (for example, \n to represent a newline character), you must double any \ that you use in LIKE strings. For example, to search for \n, specify it as \\n. To search for \, specify it as \\\\; this is because the backslashes are stripped once by the parser and again when the pattern match is made, leaving a single backslash to be matched against.

Exception: At the end of the pattern string, backslash can be specified as \\. At the end of the string, backslash stands for itself because there is nothing following to escape. Suppose that a table contains the following values:

mysql> SELECT filename FROM t1;
+--------------+
| filename     |
+--------------+
| C:           |
| C:\          |
| C:\Programs  |
| C:\Programs\ |
+--------------+

To test for values that end with backslash, you can match the values using either of the following patterns:

mysql> SELECT filename, filename LIKE '%\\' FROM t1;
+--------------+---------------------+
| filename     | filename LIKE '%\\' |
+--------------+---------------------+
| C:           |                   0 |
| C:\          |                   1 |
| C:\Programs  |                   0 |
| C:\Programs\ |                   1 |
+--------------+---------------------+
mysql> SELECT filename, filename LIKE '%\\\\' FROM t1;
+--------------+-----------------------+
| filename     | filename LIKE '%\\\\' |
+--------------+-----------------------+
| C:           |                     0 |
| C:\          |                     1 |
| C:\Programs  |                     0 |
| C:\Programs\ |                     1 |
+--------------+-----------------------+

expr NOT LIKE pat [ESCAPE 'escape_char']

This is the same as NOT (expr LIKE pat [ESCAPE 'escape_char']).

Note

Aggregate queries involving NOT LIKE comparisons with columns containing NULL may yield unexpected results. For example, consider the following table and data:

CREATE TABLE foo (bar VARCHAR(10));

INSERT INTO foo VALUES (NULL), (NULL);

The query SELECT COUNT(*) FROM foo WHERE bar LIKE '%baz%'; returns 0. You might assume that SELECT COUNT(*) FROM foo WHERE bar NOT LIKE '%baz%'; would return 2. However, this is not the case: The second query also returns 0. This is because NULL NOT LIKE expr always returns NULL, regardless of the value of expr. The same is true for aggregate queries involving NULL and comparisons using NOT RLIKE or NOT REGEXP. In such cases, you must test explicitly for NULL (with IS NULL), combining the test with OR (and not AND), as shown here:

SELECT COUNT(*) FROM foo WHERE bar NOT LIKE '%baz%' OR bar IS NULL;
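
The behavior described in the preceding note follows from how NULL propagates through comparisons and logical operators. The short sketch below evaluates the relevant expressions on a literal NULL, without any table; the column aliases are illustrative.

```sql
-- NOT LIKE applied to NULL yields NULL, which a WHERE clause treats as not true.
SELECT NULL NOT LIKE '%baz%' AS not_like_result;                -- NULL

-- OR with an explicit IS NULL test yields 1 (true) for NULL input.
SELECT (NULL NOT LIKE '%baz%') OR (NULL IS NULL) AS with_or;    -- 1

-- AND cannot rescue the row: NULL AND 1 is still NULL.
SELECT (NULL NOT LIKE '%baz%') AND (NULL IS NULL) AS with_and;  -- NULL
```

This is why the corrected query joins the two conditions with OR: a row whose column is NULL then qualifies through the IS NULL branch.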
STRCMP(expr1,expr2)

STRCMP() returns 0 if the strings are the same, -1 if the first argument is smaller than the second according to the current sort order, and NULL if either argument is NULL. It returns 1 otherwise.

mysql> SELECT STRCMP('text', 'text2');
        -> -1
mysql> SELECT STRCMP('text2', 'text');
        -> 1
mysql> SELECT STRCMP('text', 'text');
        -> 0

STRCMP() performs the comparison using the collation of the arguments.

mysql> SET @s1 = _utf8mb4 'x' COLLATE utf8mb4_0900_ai_ci;
mysql> SET @s2 = _utf8mb4 'X' COLLATE utf8mb4_0900_ai_ci;
mysql> SET @s3 = _utf8mb4 'x' COLLATE utf8mb4_0900_as_cs;
mysql> SET @s4 = _utf8mb4 'X' COLLATE utf8mb4_0900_as_cs;
mysql> SELECT STRCMP(@s1, @s2), STRCMP(@s3, @s4);
+------------------+------------------+
| STRCMP(@s1, @s2) | STRCMP(@s3, @s4) |
+------------------+------------------+
|                0 |               -1 |
+------------------+------------------+

If the collations are incompatible, one of the arguments must be converted to be compatible with the other. See Section 10.8.4, “Collation Coercibility in Expressions”.

mysql> SELECT STRCMP(@s1, @s3);
ERROR 1267 (HY000): Illegal mix of collations (utf8mb4_0900_ai_ci,IMPLICIT)
and (utf8mb4_0900_as_cs,IMPLICIT) for operation 'strcmp'
mysql> SELECT STRCMP(@s1, @s3 COLLATE utf8mb4_0900_ai_ci);
+---------------------------------------------+
| STRCMP(@s1, @s3 COLLATE utf8mb4_0900_ai_ci) |
+---------------------------------------------+
|                                           0 |
+---------------------------------------------+
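
A few additional calls may help make the return values concrete. This sketch assumes the connection uses the default utf8mb4_0900_ai_ci collation, which is case-insensitive; with a case-sensitive connection collation the first result would differ.

```sql
-- Assumes a case-insensitive connection collation (the utf8mb4 default).
SELECT STRCMP('apple', 'APPLE');    -- 0: equal when case is ignored
SELECT STRCMP('apple', 'banana');   -- -1: 'apple' sorts before 'banana'
SELECT STRCMP('apple', NULL);       -- NULL: a NULL argument yields NULL
```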
Table 12.14 Regular Expression Functions and Operators

| Name | Description |
|---|---|
| NOT REGEXP | Negation of REGEXP |
| REGEXP | Whether string matches regular expression |
| REGEXP_INSTR() | Starting index of substring matching regular expression |
| REGEXP_LIKE() | Whether string matches regular expression |
| REGEXP_REPLACE() | Replace substrings matching regular expression |
| REGEXP_SUBSTR() | Return substring matching regular expression |
| RLIKE | Whether string matches regular expression |
A regular expression is a powerful way of specifying a pattern for a complex search. This section discusses the functions and operators available for regular expression matching and illustrates, with examples, some of the special characters and constructs that can be used for regular expression operations. See also Section 3.3.4.7, “Pattern Matching”.
MySQL implements regular expression support using International Components for Unicode (ICU), which provides full Unicode support and is multibyte safe. (Prior to MySQL 8.0.4, MySQL used Henry Spencer's implementation of regular expressions, which operates in byte-wise fashion and is not multibyte safe. For information about ways in which applications that use regular expressions may be affected by the implementation change, see Regular Expression Compatibility Considerations.)
expr NOT REGEXP pat, expr NOT RLIKE pat

This is the same as NOT (expr REGEXP pat).

expr REGEXP pat, expr RLIKE pat

Returns 1 if the string expr matches the regular expression specified by the pattern pat, 0 otherwise. If expr or pat is NULL, the return value is NULL.

REGEXP and RLIKE are synonyms for REGEXP_LIKE().

For additional information about how matching occurs, see the description for REGEXP_LIKE().

mysql> SELECT 'Michael!' REGEXP '.*';
+------------------------+
| 'Michael!' REGEXP '.*' |
+------------------------+
|                      1 |
+------------------------+
mysql> SELECT 'new*\n*line' REGEXP 'new\\*.\\*line';
+---------------------------------------+
| 'new*\n*line' REGEXP 'new\\*.\\*line' |
+---------------------------------------+
|                                     0 |
+---------------------------------------+
mysql> SELECT 'a' REGEXP '^[a-d]';
+---------------------+
| 'a' REGEXP '^[a-d]' |
+---------------------+
|                   1 |
+---------------------+
mysql> SELECT 'a' REGEXP 'A', 'a' REGEXP BINARY 'A';
+----------------+-----------------------+
| 'a' REGEXP 'A' | 'a' REGEXP BINARY 'A' |
+----------------+-----------------------+
|              1 |                     0 |
+----------------+-----------------------+

REGEXP_INSTR(expr, pat[, pos[, occurrence[, return_option[, match_type]]]])

Returns the starting index of the substring of the string expr that matches the regular expression specified by the pattern pat, 0 if there is no match. If expr or pat is NULL, the return value is NULL. Character indexes begin at 1.

REGEXP_INSTR() takes these optional arguments:

- pos: The position in expr at which to start the search. If omitted, the default is 1.
- occurrence: Which occurrence of a match to search for. If omitted, the default is 1.
- return_option: Which type of position to return. If this value is 0, REGEXP_INSTR() returns the position of the matched substring's first character. If this value is 1, REGEXP_INSTR() returns the position following the matched substring. If omitted, the default is 0.
- match_type: A string that specifies how to perform matching. The meaning is as described for REGEXP_LIKE().
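
To show how the optional arguments of REGEXP_INSTR() interact, here is a small sketch that reuses the 'dog cat dog' string. The specific argument combinations are illustrative.

```sql
-- occurrence = 2: start of the second match ('dog' at positions 9..11).
SELECT REGEXP_INSTR('dog cat dog', 'dog', 1, 2);            -- 9

-- return_option = 1: position just after the first match (which spans 1..3).
SELECT REGEXP_INSTR('dog cat dog', 'dog', 1, 1, 1);         -- 4

-- match_type 'c' forces case-sensitive matching, so 'DOG' is not found.
SELECT REGEXP_INSTR('dog cat dog', 'DOG', 1, 1, 0, 'c');    -- 0
```

Because the arguments are positional, supplying match_type requires giving pos, occurrence, and return_option explicitly, even when their defaults are wanted.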
For additional information about how matching occurs, see the description for REGEXP_LIKE().

mysql> SELECT REGEXP_INSTR('dog cat dog', 'dog');
+------------------------------------+
| REGEXP_INSTR('dog cat dog', 'dog') |
+------------------------------------+
|                                  1 |
+------------------------------------+
mysql> SELECT REGEXP_INSTR('dog cat dog', 'dog', 2);
+---------------------------------------+
| REGEXP_INSTR('dog cat dog', 'dog', 2) |
+---------------------------------------+
|                                     9 |
+---------------------------------------+
mysql> SELECT REGEXP_INSTR('aa aaa aaaa', 'a{2}');
+-------------------------------------+
| REGEXP_INSTR('aa aaa aaaa', 'a{2}') |
+-------------------------------------+
|                                   1 |
+-------------------------------------+
mysql> SELECT REGEXP_INSTR('aa aaa aaaa', 'a{4}');
+-------------------------------------+
| REGEXP_INSTR('aa aaa aaaa', 'a{4}') |
+-------------------------------------+
|                                   8 |
+-------------------------------------+

REGEXP_LIKE(expr, pat[, match_type])

Returns 1 if the string expr matches the regular expression specified by the pattern pat, 0 otherwise. If expr or pat is NULL, the return value is NULL.

The pattern can be an extended regular expression, the syntax for which is discussed in Regular Expression Syntax. The pattern need not be a literal string. For example, it can be specified as a string expression or table column.

The optional match_type argument is a string that may contain any or all the following characters specifying how to perform matching:

- c: Case-sensitive matching.
- i: Case-insensitive matching.
- m: Multiple-line mode. Recognize line terminators within the string. The default behavior is to match line terminators only at the start and end of the string expression.
- n: The . character matches line terminators. The default is for . matching to stop at the end of a line.
- u: Unix-only line endings. Only the newline character is recognized as a line ending by the ., ^, and $ match operators.
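
The difference between the m and n characters is easy to confuse, so a minimal sketch may help. It assumes the default SQL mode, in which \n inside a string literal is a newline; the test strings are illustrative.

```sql
-- Default: . does not match the newline between 'a' and 'b'.
SELECT REGEXP_LIKE('a\nb', 'a.b');         -- 0

-- 'n': . also matches line terminators.
SELECT REGEXP_LIKE('a\nb', 'a.b', 'n');    -- 1

-- Default: ^ anchors only at the start of the whole string.
SELECT REGEXP_LIKE('a\nb', '^b$');         -- 0

-- 'm': ^ and $ also match next to line terminators inside the string.
SELECT REGEXP_LIKE('a\nb', '^b$', 'm');    -- 1
```

In short, n changes what . can consume, whereas m changes where ^ and $ are allowed to match.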
If characters specifying contradictory options are specified within match_type, the rightmost one takes precedence.

By default, regular expression operations use the character set and collation of the expr and pat arguments when deciding the type of a character and performing the comparison. If the arguments have different character sets or collations, coercibility rules apply as described in Section 10.8.4, “Collation Coercibility in Expressions”. Arguments may be specified with explicit collation indicators to change comparison behavior.

mysql> SELECT REGEXP_LIKE('CamelCase', 'CAMELCASE');
+---------------------------------------+
| REGEXP_LIKE('CamelCase', 'CAMELCASE') |
+---------------------------------------+
|                                     1 |
+---------------------------------------+
mysql> SELECT REGEXP_LIKE('CamelCase', 'CAMELCASE' COLLATE utf8mb4_0900_as_cs);
+------------------------------------------------------------------+
| REGEXP_LIKE('CamelCase', 'CAMELCASE' COLLATE utf8mb4_0900_as_cs) |
+------------------------------------------------------------------+
|                                                                0 |
+------------------------------------------------------------------+

match_type may be specified with the c or i characters to override the default case sensitivity. Exception: If either argument is a binary string, the arguments are handled in case-sensitive fashion as binary strings, even if match_type contains the i character.

Note

Because MySQL uses the C escape syntax in strings (for example, \n to represent the newline character), you must double any \ that you use in your expr and pat arguments.

mysql> SELECT REGEXP_LIKE('Michael!', '.*');
+-------------------------------+
| REGEXP_LIKE('Michael!', '.*') |
+-------------------------------+
|                             1 |
+-------------------------------+
mysql> SELECT REGEXP_LIKE('new*\n*line', 'new\\*.\\*line');
+----------------------------------------------+
| REGEXP_LIKE('new*\n*line', 'new\\*.\\*line') |
+----------------------------------------------+
|                                            0 |
+----------------------------------------------+
mysql> SELECT REGEXP_LIKE('a', '^[a-d]');
+----------------------------+
| REGEXP_LIKE('a', '^[a-d]') |
+----------------------------+
|                          1 |
+----------------------------+
mysql> SELECT REGEXP_LIKE('a', 'A'), REGEXP_LIKE('a', BINARY 'A');
+-----------------------+------------------------------+
| REGEXP_LIKE('a', 'A') | REGEXP_LIKE('a', BINARY 'A') |
+-----------------------+------------------------------+
|                     1 |                            0 |
+-----------------------+------------------------------+

mysql> SELECT REGEXP_LIKE('abc', 'ABC');
+---------------------------+
| REGEXP_LIKE('abc', 'ABC') |
+---------------------------+
|                         1 |
+---------------------------+
mysql> SELECT REGEXP_LIKE('abc', 'ABC', 'c');
+--------------------------------+
| REGEXP_LIKE('abc', 'ABC', 'c') |
+--------------------------------+
|                              0 |
+--------------------------------+

REGEXP_REPLACE(expr, pat, repl[, pos[, occurrence[, match_type]]])

Replaces occurrences in the string expr that match the regular expression specified by the pattern pat with the replacement string repl, and returns the resulting string. If expr, pat, or repl is NULL, the return value is NULL.

REGEXP_REPLACE() takes these optional arguments:

- pos: The position in expr at which to start the search. If omitted, the default is 1.
- occurrence: Which occurrence of a match to replace. If omitted, the default is 0 (which means “replace all occurrences”).
- match_type: A string that specifies how to perform matching. The meaning is as described for REGEXP_LIKE().
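
As a brief sketch of how occurrence and match_type affect REGEXP_REPLACE(), consider the following calls. The input strings are illustrative.

```sql
-- occurrence omitted (0): every run of spaces is collapsed to one space.
SELECT REGEXP_REPLACE('a   b    c', ' +', ' ');          -- 'a b c'

-- occurrence = 2: only the second single-letter match is replaced.
SELECT REGEXP_REPLACE('a b c', '[a-z]', 'X', 1, 2);      -- 'a X c'

-- match_type 'c': uppercase 'B' does not match 'b', so nothing changes.
SELECT REGEXP_REPLACE('a b c', 'B', 'X', 1, 0, 'c');     -- 'a b c'
```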
Prior to MySQL 8.0.17, the result returned by this function used the UTF-16 character set; in MySQL 8.0.17 and later, the character set and collation of the expression searched for matches is used. (Bug #94203, Bug #29308212)

For additional information about how matching occurs, see the description for REGEXP_LIKE().

mysql> SELECT REGEXP_REPLACE('a b c', 'b', 'X');
+-----------------------------------+
| REGEXP_REPLACE('a b c', 'b', 'X') |
+-----------------------------------+
| a X c                             |
+-----------------------------------+
mysql> SELECT REGEXP_REPLACE('abc def ghi', '[a-z]+', 'X', 1, 3);
+----------------------------------------------------+
| REGEXP_REPLACE('abc def ghi', '[a-z]+', 'X', 1, 3) |
+----------------------------------------------------+
| abc def X                                          |
+----------------------------------------------------+

REGEXP_SUBSTR(expr, pat[, pos[, occurrence[, match_type]]])

Returns the substring of the string expr that matches the regular expression specified by the pattern pat, NULL if there is no match. If expr or pat is NULL, the return value is NULL.

REGEXP_SUBSTR() takes these optional arguments:

- pos: The position in expr at which to start the search. If omitted, the default is 1.
- occurrence: Which occurrence of a match to search for. If omitted, the default is 1.
- match_type: A string that specifies how to perform matching. The meaning is as described for REGEXP_LIKE().
Prior to MySQL 8.0.17, the result returned by this function used the UTF-16 character set; in MySQL 8.0.17 and later, the character set and collation of the expression searched for matches is used. (Bug #94203, Bug #29308212)

For additional information about how matching occurs, see the description for REGEXP_LIKE().

mysql> SELECT REGEXP_SUBSTR('abc def ghi', '[a-z]+');
+----------------------------------------+
| REGEXP_SUBSTR('abc def ghi', '[a-z]+') |
+----------------------------------------+
| abc                                    |
+----------------------------------------+
mysql> SELECT REGEXP_SUBSTR('abc def ghi', '[a-z]+', 1, 3);
+----------------------------------------------+
| REGEXP_SUBSTR('abc def ghi', '[a-z]+', 1, 3) |
+----------------------------------------------+
| ghi                                          |
+----------------------------------------------+
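
A typical use of REGEXP_SUBSTR() is pulling one token out of a longer string. The following sketch uses made-up input values to show a match, the no-match case, and the occurrence argument.

```sql
-- First run of digits in the string.
SELECT REGEXP_SUBSTR('order-4521-x', '[0-9]+');      -- '4521'

-- No digits at all: the result is NULL, not an empty string.
SELECT REGEXP_SUBSTR('order-x', '[0-9]+');           -- NULL

-- occurrence = 2: the second run of digits.
SELECT REGEXP_SUBSTR('ab12cd34', '[0-9]+', 1, 2);    -- '34'
```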
A regular expression describes a set of strings. The simplest regular expression is one that has no special characters in it. For example, the regular expression hello matches hello and nothing else.

Nontrivial regular expressions use certain special constructs so that they can match more than one string. For example, the regular expression hello|world contains the | alternation operator and matches either hello or world.

As a more complex example, the regular expression B[an]*s matches any of the strings Bananas, Baaaaas, Bs, and any other string starting with a B, ending with an s, and containing any number of a or n characters in between.
The following list covers some of the basic special characters and constructs that can be used in regular expressions. For information about the full regular expression syntax supported by the ICU library used to implement regular expression support, visit the International Components for Unicode website.
- ^

  Match the beginning of a string.

  mysql> SELECT REGEXP_LIKE('fo\nfo', '^fo$');
          -> 0
  mysql> SELECT REGEXP_LIKE('fofo', '^fo');
          -> 1

- $

  Match the end of a string.

  mysql> SELECT REGEXP_LIKE('fo\no', '^fo\no$');
          -> 1
  mysql> SELECT REGEXP_LIKE('fo\no', '^fo$');
          -> 0

- .

  Match any character. By default, . does not match line terminators (carriage return, newline); to make it match them, give the n match-control character or the (?s) within-pattern modifier. The m (multiple line) match-control character, or the (?m) within-pattern modifier, instead lets ^ and $ match next to line terminators inside the string, which is why the last two of the following examples return 1.

  mysql> SELECT REGEXP_LIKE('fofo', '^f.*$');
          -> 1
  mysql> SELECT REGEXP_LIKE('fo\r\nfo', '^f.*$');
          -> 0
  mysql> SELECT REGEXP_LIKE('fo\r\nfo', '^f.*$', 'm');
          -> 1
  mysql> SELECT REGEXP_LIKE('fo\r\nfo', '(?m)^f.*$');
          -> 1

- a*

  Match any sequence of zero or more a characters.

  mysql> SELECT REGEXP_LIKE('Ban', '^Ba*n');
          -> 1
  mysql> SELECT REGEXP_LIKE('Baaan', '^Ba*n');
          -> 1
  mysql> SELECT REGEXP_LIKE('Bn', '^Ba*n');
          -> 1

- a+

  Match any sequence of one or more a characters.

  mysql> SELECT REGEXP_LIKE('Ban', '^Ba+n');
          -> 1
  mysql> SELECT REGEXP_LIKE('Bn', '^Ba+n');
          -> 0

- a?

  Match either zero or one a character.

  mysql> SELECT REGEXP_LIKE('Bn', '^Ba?n');
          -> 1
  mysql> SELECT REGEXP_LIKE('Ban', '^Ba?n');
          -> 1
  mysql> SELECT REGEXP_LIKE('Baan', '^Ba?n');
          -> 0

- de|abc

  Alternation; match either of the sequences de or abc.

  mysql> SELECT REGEXP_LIKE('pi', 'pi|apa');
          -> 1
  mysql> SELECT REGEXP_LIKE('axe', 'pi|apa');
          -> 0
  mysql> SELECT REGEXP_LIKE('apa', 'pi|apa');
          -> 1
  mysql> SELECT REGEXP_LIKE('apa', '^(pi|apa)$');
          -> 1
  mysql> SELECT REGEXP_LIKE('pi', '^(pi|apa)$');
          -> 1
  mysql> SELECT REGEXP_LIKE('pix', '^(pi|apa)$');
          -> 0

- (abc)*

  Match zero or more instances of the sequence abc.

  mysql> SELECT REGEXP_LIKE('pi', '^(pi)*$');
          -> 1
  mysql> SELECT REGEXP_LIKE('pip', '^(pi)*$');
          -> 0
  mysql> SELECT REGEXP_LIKE('pipi', '^(pi)*$');
          -> 1

- {1}, {2,3}

  Repetition; {n} and {m,n} notation provide a more general way of writing regular expressions that match many occurrences of the previous atom (or “piece”) of the pattern. m and n are integers.

  - a* can be written as a{0,}.
  - a+ can be written as a{1,}.
  - a? can be written as a{0,1}.
  To be more precise, a{n} matches exactly n instances of a. a{n,} matches n or more instances of a. a{m,n} matches m through n instances of a, inclusive. If both m and n are given, m must be less than or equal to n.

  mysql> SELECT REGEXP_LIKE('abcde', 'a[bcd]{2}e');
          -> 0
  mysql> SELECT REGEXP_LIKE('abcde', 'a[bcd]{3}e');
          -> 1
  mysql> SELECT REGEXP_LIKE('abcde', 'a[bcd]{1,10}e');
          -> 1

- [a-dX], [^a-dX]

  Matches any character that is (or is not, if ^ is used) either a, b, c, d or X. A - character between two other characters forms a range that matches all characters from the first character to the second. For example, [0-9] matches any decimal digit. To include a literal ] character, it must immediately follow the opening bracket [. To include a literal - character, it must be written first or last. Any character that does not have a defined special meaning inside a [] pair matches only itself.

  mysql> SELECT REGEXP_LIKE('aXbc', '[a-dXYZ]');
          -> 1
  mysql> SELECT REGEXP_LIKE('aXbc', '^[a-dXYZ]$');
          -> 0
  mysql> SELECT REGEXP_LIKE('aXbc', '^[a-dXYZ]+$');
          -> 1
  mysql> SELECT REGEXP_LIKE('aXbc', '^[^a-dXYZ]+$');
          -> 0
  mysql> SELECT REGEXP_LIKE('gheis', '^[^a-dXYZ]+$');
          -> 1
  mysql> SELECT REGEXP_LIKE('gheisa', '^[^a-dXYZ]+$');
          -> 0

- [=character_class=]

  Within a bracket expression (written using [ and ]), [=character_class=] represents an equivalence class. It matches all characters with the same collation value, including itself. For example, if o and (+) are the members of an equivalence class, [[=o=]], [[=(+)=]], and [o(+)] are all synonymous. An equivalence class may not be used as an endpoint of a range.

- [:character_class:]

  Within a bracket expression (written using [ and ]), [:character_class:] represents a character class that matches all characters belonging to that class. The following table lists the standard class names. These names stand for the character classes defined in the ctype(3) manual page. A particular locale may provide other class names. A character class may not be used as an endpoint of a range.

  | Character Class Name | Meaning |
  |---|---|
  | alnum | Alphanumeric characters |
  | alpha | Alphabetic characters |
  | blank | Whitespace characters |
  | cntrl | Control characters |
  | digit | Digit characters |
  | graph | Graphic characters |
  | lower | Lowercase alphabetic characters |
  | print | Graphic or space characters |
  | punct | Punctuation characters |
  | space | Space, tab, newline, and carriage return |
  | upper | Uppercase alphabetic characters |
  | xdigit | Hexadecimal digit characters |

  mysql> SELECT REGEXP_LIKE('justalnums', '[[:alnum:]]+');
          -> 1
  mysql> SELECT REGEXP_LIKE('!!', '[[:alnum:]]+');
          -> 0
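
The constructs in the preceding list are usually combined. As a minimal sketch, the following patterns check whether a string has the shape of an ISO-style date; the sample strings are illustrative, and the patterns test format only, not whether the date is valid.

```sql
-- Anchors, bracket ranges, and {n} repetition together.
SELECT REGEXP_LIKE('2024-01-15', '^[0-9]{4}-[0-9]{2}-[0-9]{2}$');            -- 1

-- The same shape using a character class and a repeated group.
SELECT REGEXP_LIKE('2024-01-15', '^[[:digit:]]{4}(-[[:digit:]]{2}){2}$');    -- 1

-- A one-digit month does not satisfy the {2} repetition.
SELECT REGEXP_LIKE('2024-1-15', '^[0-9]{4}-[0-9]{2}-[0-9]{2}$');             -- 0
```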
To use a literal instance of a special character in a regular expression, precede it by two backslash (\) characters. The MySQL parser interprets one of the backslashes, and the regular expression library interprets the other. For example, to match the string 1+2 that contains the special + character, only the last of the following regular expressions is the correct one:

mysql> SELECT REGEXP_LIKE('1+2', '1+2');
        -> 0
mysql> SELECT REGEXP_LIKE('1+2', '1\+2');
        -> 0
mysql> SELECT REGEXP_LIKE('1+2', '1\\+2');
        -> 1
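
The same doubling applies to any other special character. As a short sketch, the following calls contrast an unescaped and an escaped period; they assume the NO_BACKSLASH_ESCAPES SQL mode is not enabled, and the file names are made up.

```sql
-- Unescaped . matches any character, so 'x' before 'txt' is accepted.
SELECT REGEXP_LIKE('filextxt', '.txt$');      -- 1

-- \\. reaches the regular expression library as \. and matches only a period.
SELECT REGEXP_LIKE('filextxt', '\\.txt$');    -- 0
SELECT REGEXP_LIKE('file.txt', '\\.txt$');    -- 1
```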
REGEXP_LIKE() and similar functions use resources that can be controlled by setting system variables:

- The match engine uses memory for its internal stack. To control the maximum available memory for the stack in bytes, set the regexp_stack_limit system variable.
- The match engine operates in steps. To control the maximum number of steps performed by the engine (and thus indirectly the execution time), set the regexp_time_limit system variable. Because this limit is expressed as number of steps, it affects execution time only indirectly. Typically, it is on the order of milliseconds.
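
As a sketch of how these limits might be inspected and adjusted, the statements below assume that both variables have global scope and can be changed at runtime by an account with the privilege to set global system variables. The values shown are arbitrary examples, not recommendations.

```sql
-- Show the current values of both regular expression limits.
SHOW VARIABLES LIKE 'regexp%';

-- Example values only; choose limits appropriate to your workload.
SET GLOBAL regexp_time_limit = 64;
SET GLOBAL regexp_stack_limit = 16000000;
```

Raising either limit lets more complex patterns run to completion at the cost of more time or memory per match.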
Prior to MySQL 8.0.4, MySQL used the Henry Spencer regular expression library to support regular expression operations, rather than International Components for Unicode (ICU). The following discussion describes differences between the Spencer and ICU libraries that may affect applications:
- With the Spencer library, the REGEXP and RLIKE operators work in byte-wise fashion, so they are not multibyte safe and may produce unexpected results with multibyte character sets. In addition, these operators compare characters by their byte values and accented characters may not compare as equal even if a given collation treats them as equal.

- ICU has full Unicode support and is multibyte safe. Its regular expression functions treat all strings as UTF-16. You should keep in mind that positional indexes are based on 16-bit chunks and not on code points. This means that, when passed to such functions, characters using more than one chunk may produce unanticipated results, such as those shown here:

  mysql> SELECT REGEXP_INSTR('🍣🍣b', 'b');
  +--------------------------+
  | REGEXP_INSTR('??b', 'b') |
  +--------------------------+
  |                        5 |
  +--------------------------+
  1 row in set (0.00 sec)

  mysql> SELECT REGEXP_INSTR('🍣🍣bxxx', 'b', 4);
  +--------------------------------+
  | REGEXP_INSTR('??bxxx', 'b', 4) |
  +--------------------------------+
  |                              5 |
  +--------------------------------+
  1 row in set (0.00 sec)

  Characters within the Unicode Basic Multilingual Plane, which includes characters used by most modern languages, are safe in this regard:
  mysql> SELECT REGEXP_INSTR('бжb', 'b');
  +----------------------------+
  | REGEXP_INSTR('бжb', 'b')   |
  +----------------------------+
  |                          3 |
  +----------------------------+
  1 row in set (0.00 sec)

  mysql> SELECT REGEXP_INSTR('עבb', 'b');
  +----------------------------+
  | REGEXP_INSTR('עבb', 'b')   |
  +----------------------------+
  |                          3 |
  +----------------------------+
  1 row in set (0.00 sec)

  mysql> SELECT REGEXP_INSTR('µå周çб', '周');
  +------------------------------------+
  | REGEXP_INSTR('µå周çб', '周')       |
  +------------------------------------+
  |                                  3 |
  +------------------------------------+
  1 row in set (0.00 sec)

  Emoji, such as the “sushi” character 🍣 (U+1F363) used in the first two examples, are not included in the Basic Multilingual Plane, but rather in Unicode's Supplementary Multilingual Plane. Another issue can arise with emoji and other 4-byte characters when REGEXP_SUBSTR() or a similar function begins searching in the middle of a character. Each of the two statements in the following example starts from the second 2-byte position in the first argument. The first statement works on a string consisting solely of 2-byte (BMP) characters. The second statement contains 4-byte characters which are incorrectly interpreted in the result because the first two bytes are stripped off and so the remainder of the character data is misaligned.

  mysql> SELECT REGEXP_SUBSTR('周周周周', '.*', 2);
  +----------------------------------------+
  | REGEXP_SUBSTR('周周周周', '.*', 2)     |
  +----------------------------------------+
  | 周周周                                 |
  +----------------------------------------+
  1 row in set (0.00 sec)

  mysql> SELECT REGEXP_SUBSTR('🍣🍣🍣🍣', '.*', 2);
  +--------------------------------+
  | REGEXP_SUBSTR('????', '.*', 2) |
  +--------------------------------+
  | ?㳟揘㳟揘㳟揘                  |
  +--------------------------------+
  1 row in set (0.00 sec)

- For the . operator, the Spencer library matches line-terminator characters (carriage return, newline) anywhere in string expressions, including in the middle. To match line terminator characters in the middle of strings with ICU, specify the n match-control character, which makes . match line terminators. The m match-control character instead lets ^ and $ match next to line terminators within the string.

- The Spencer library supports word-beginning and word-end boundary markers ([[:<:]] and [[:>:]] notation). ICU does not. For ICU, you can use \b to match word boundaries; double the backslash because MySQL interprets it as the escape character within strings.

- The Spencer library supports collating element bracket expressions ([.characters.] notation). ICU does not.

- For repetition counts ({n} and {m,n} notation), the Spencer library has a maximum of 255. ICU has no such limit, although the maximum number of match engine steps can be limited by setting the regexp_time_limit system variable.

- ICU interprets parentheses as metacharacters. To specify a literal open parenthesis ( or close parenthesis ) in a regular expression, it must be escaped:

  mysql> SELECT REGEXP_LIKE('(', '(');
  ERROR 3692 (HY000): Mismatched parenthesis in regular expression.
  mysql> SELECT REGEXP_LIKE('(', '\\(');
  +-------------------------+
  | REGEXP_LIKE('(', '\\(') |
  +-------------------------+
  |                       1 |
  +-------------------------+
  mysql> SELECT REGEXP_LIKE(')', ')');
  ERROR 3692 (HY000): Mismatched parenthesis in regular expression.
  mysql> SELECT REGEXP_LIKE(')', '\\)');
  +-------------------------+
  | REGEXP_LIKE(')', '\\)') |
  +-------------------------+
  |                       1 |
  +-------------------------+

- ICU also interprets square brackets as metacharacters, but only the opening square bracket need be escaped to be used as a literal character:
mysql>
SELECT REGEXP_LIKE('[', '[');
ERROR 3696 (HY000): The regular expression contains an unclosed bracket expression. mysql>SELECT REGEXP_LIKE('[', '\\[');
+-------------------------+ | REGEXP_LIKE('[', '\\[') | +-------------------------+ | 1 | +-------------------------+ mysql>SELECT REGEXP_LIKE(']', ']');
+-----------------------+ | REGEXP_LIKE(']', ']') | +-----------------------+ | 1 | +-----------------------+
MySQL has many operators and functions that return a string. This section answers the question: What is the character set and collation of such a string?
For simple functions that take string input and return a string result as output, the output's character set and collation are the same as those of the principal input value. For example, UPPER(X) returns a string with the same character set and collation as X. The same applies for INSTR(), LCASE(), LOWER(), LTRIM(), MID(), REPEAT(), REPLACE(), REVERSE(), RIGHT(), RPAD(), RTRIM(), SOUNDEX(), SUBSTRING(), TRIM(), UCASE(), and UPPER().
The REPLACE() function, unlike all other functions, always ignores the collation of the string input and performs a case-sensitive comparison.
If a string input or function result is a binary string, the string has the binary character set and collation. This can be checked by using the CHARSET() and COLLATION() functions, both of which return binary for a binary string argument:
mysql> SELECT CHARSET(BINARY 'a'), COLLATION(BINARY 'a');
+---------------------+-----------------------+
| CHARSET(BINARY 'a') | COLLATION(BINARY 'a') |
+---------------------+-----------------------+
| binary              | binary                |
+---------------------+-----------------------+
For operations that combine multiple string inputs and return a single string output, the “aggregation rules” of standard SQL apply for determining the collation of the result:
- If an explicit COLLATE Y occurs, use Y.
- If explicit COLLATE Y and COLLATE Z occur, raise an error.
- Otherwise, if all collations are Y, use Y.
- Otherwise, the result has no collation.
For example, with CASE ... WHEN a THEN b WHEN b THEN c COLLATE X END, the resulting collation is X. The same applies for UNION, ||, CONCAT(), ELT(), GREATEST(), IF(), and LEAST().
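The four aggregation rules can be modeled outside the server. The following Python sketch is only an illustration of the rules as listed here. The function name aggregate_collation and the (collation, is_explicit) pair representation are assumptions made for the example, and the sketch does not reproduce any additional coercibility handling the server may apply.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical representation: (collation name, True if set by an explicit COLLATE clause).
Operand = Tuple[str, bool]


def aggregate_collation(operands: Sequence[Operand]) -> Optional[str]:
    """Apply the four aggregation rules; None stands for "no collation"."""
    explicit = {name for name, is_explicit in operands if is_explicit}
    if len(explicit) > 1:
        # Explicit COLLATE Y and COLLATE Z both occur: error.
        raise ValueError("conflicting explicit collations: %s" % sorted(explicit))
    if len(explicit) == 1:
        # A single explicit COLLATE Y wins.
        return next(iter(explicit))
    names = {name for name, _ in operands}
    if len(names) == 1:
        # All operands share collation Y.
        return next(iter(names))
    # Mixed implicit collations: the result has no collation.
    return None


print(aggregate_collation([("utf8mb4_0900_ai_ci", False), ("utf8mb4_bin", True)]))
print(aggregate_collation([("latin1_swedish_ci", False), ("latin1_swedish_ci", False)]))
print(aggregate_collation([("latin1_swedish_ci", False), ("utf8mb4_bin", False)]))
```

The first call mirrors the CASE ... COLLATE X example: one explicit collation decides the result regardless of the other operands.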
For operations that convert to character data, the character set and collation of the strings that result from the operations are defined by the character_set_connection and collation_connection system variables that determine the default connection character set and collation (see Section 10.4, “Connection Character Sets and Collations”). This applies only to BIN_TO_UUID(), CAST(), CONV(), FORMAT(), HEX(), and SPACE().
An exception to the preceding principle occurs for expressions for virtual generated columns. In such expressions, the table character set is used for BIN_TO_UUID(), CONV(), or HEX() results, regardless of connection character set.
If there is any question about the character set or collation of the result returned by a string function, use the CHARSET() or COLLATION() function to find out:
mysql> SELECT USER(), CHARSET(USER()), COLLATION(USER());
+----------------+-----------------+-------------------+
| USER()         | CHARSET(USER()) | COLLATION(USER()) |
+----------------+-----------------+-------------------+
| test@localhost | utf8            | utf8_general_ci   |
+----------------+-----------------+-------------------+
mysql> SELECT CHARSET(COMPRESS('abc')), COLLATION(COMPRESS('abc'));
+--------------------------+----------------------------+
| CHARSET(COMPRESS('abc')) | COLLATION(COMPRESS('abc')) |
+--------------------------+----------------------------+
| binary                   | binary                     |
+--------------------------+----------------------------+
MySQL uses what is known as a proleptic Gregorian calendar.
Every country that has switched from the Julian to the Gregorian calendar has had to discard at least ten days during the switch. To see how this works, consider the month of October 1582, when the first Julian-to-Gregorian switch occurred.
Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday |
---|---|---|---|---|---|---|
1 | 2 | 3 | 4 | 15 | 16 | 17 |
18 | 19 | 20 | 21 | 22 | 23 | 24 |
25 | 26 | 27 | 28 | 29 | 30 | 31 |
There are no dates between October 4 and October 15. This discontinuity is called the cutover. Any dates before the cutover are Julian, and any dates following the cutover are Gregorian. Dates during a cutover are nonexistent.
A calendar applied to dates when it was not actually in use is called proleptic. Thus, if we assume there was never a cutover and Gregorian rules always apply, we have a proleptic Gregorian calendar. This is what is used by MySQL, as is required by standard SQL. For this reason, dates prior to the cutover stored as MySQL DATE or DATETIME values must be adjusted to compensate for the difference. It is important to realize that the cutover did not occur at the same time in all countries, and that the later it happened, the more days were lost. For example, in Great Britain, it took place in 1752, when Wednesday September 2 was followed by Thursday September 14. Russia remained on the Julian calendar until 1918, losing 13 days in the process, and what is popularly referred to as its “October Revolution” occurred in November according to the Gregorian calendar.
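One way to picture the adjustment is to convert a historical Julian date to its proleptic Gregorian equivalent before storing it. The sketch below is an illustration in Python, whose datetime.date type also uses the proleptic Gregorian calendar. The function name julian_to_proleptic_gregorian is an assumption for the example, and the conversion uses the common integer formula for the Julian Day Number. The Julian date of 25 October 1917 for the October Revolution is the commonly cited one and does not come from this section.

```python
from datetime import date

# date.toordinal() returns 1 for 0001-01-01 (proleptic Gregorian),
# whose Julian Day Number is 1721426.
JDN_TO_ORDINAL_OFFSET = 1721425


def julian_to_proleptic_gregorian(year: int, month: int, day: int) -> date:
    """Convert a Julian-calendar date to the same day in the proleptic Gregorian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    # Julian Day Number of a Julian-calendar date (integer arithmetic).
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083
    return date.fromordinal(jdn - JDN_TO_ORDINAL_OFFSET)


# Day before the first cutover: Julian October 4, 1582.
print(julian_to_proleptic_gregorian(1582, 10, 4))    # 1582-10-14
# Last Julian day in Great Britain: September 2, 1752.
print(julian_to_proleptic_gregorian(1752, 9, 2))     # 1752-09-13
# Julian October 25, 1917 falls in November on the Gregorian calendar.
print(julian_to_proleptic_gregorian(1917, 10, 25))   # 1917-11-07
```

Each result is one day before the first Gregorian date that followed the local cutover (October 15, 1582 and September 14, 1752), which matches the gaps described above.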
MATCH (col1,col2,...) AGAINST (expr [search_modifier])

search_modifier:
  {
       IN NATURAL LANGUAGE MODE
     | IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION
     | IN BOOLEAN MODE
     | WITH QUERY EXPANSION
  }
MySQL has support for full-text indexing and searching:
- A full-text index in MySQL is an index of type FULLTEXT.
- Full-text indexes can be used only with InnoDB or MyISAM tables, and can be created only for CHAR, VARCHAR, or TEXT columns.
- MySQL provides a built-in full-text ngram parser that supports Chinese, Japanese, and Korean (CJK), and an installable MeCab full-text parser plugin for Japanese. Parsing differences are outlined in Section 12.10.8, “ngram Full-Text Parser”, and Section 12.10.9, “MeCab Full-Text Parser Plugin”.
- A FULLTEXT index definition can be given in the CREATE TABLE statement when a table is created, or added later using ALTER TABLE or CREATE INDEX.
- For large data sets, it is much faster to load your data into a table that has no FULLTEXT index and then create the index after that, than to load data into a table that has an existing FULLTEXT index.
Full-text searching is performed using MATCH() ... AGAINST syntax. MATCH() takes a comma-separated list that names the columns to be searched. AGAINST takes a string to search for, and an optional modifier that indicates what type of search to perform. The search string must be a string value that is constant during query evaluation. This rules out, for example, a table column because that can differ for each row.
There are three types of full-text searches:
- A natural language search interprets the search string as a phrase in natural human language (a phrase in free text). There are no special operators, with the exception of double quote (") characters. The stopword list applies. For more information about stopword lists, see Section 12.10.4, “Full-Text Stopwords”.
  Full-text searches are natural language searches if the IN NATURAL LANGUAGE MODE modifier is given or if no modifier is given. For more information, see Section 12.10.1, “Natural Language Full-Text Searches”.
- A boolean search interprets the search string using the rules of a special query language. The string contains the words to search for. It can also contain operators that specify requirements such that a word must be present or absent in matching rows, or that it should be weighted higher or lower than usual. Certain common words (stopwords) are omitted from the search index and do not match if present in the search string. The IN BOOLEAN MODE modifier specifies a boolean search. For more information, see Section 12.10.2, “Boolean Full-Text Searches”.
- A query expansion search is a modification of a natural language search. The search string is used to perform a natural language search. Then words from the most relevant rows returned by the search are added to the search string and the search is done again. The query returns the rows from the second search. The IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION or WITH QUERY EXPANSION modifier specifies a query expansion search. For more information, see Section 12.10.3, “Full-Text Searches with Query Expansion”.
For information about FULLTEXT query performance, see Section 8.3.5, “Column Indexes”.
For more information about InnoDB FULLTEXT indexes, see Section 15.6.2.4, “InnoDB FULLTEXT Indexes”.
Constraints on full-text searching are listed in Section 12.10.5, “Full-Text Restrictions”.
The myisam_ftdump utility dumps the contents of a MyISAM full-text index. This may be helpful for debugging full-text queries. See Section 4.6.3, “myisam_ftdump — Display Full-Text Index information”.
By default or with the IN NATURAL LANGUAGE MODE modifier, the MATCH() function performs a natural language search for a string against a text collection. A collection is a set of one or more columns included in a FULLTEXT index. The search string is given as the argument to AGAINST(). For each row in the table, MATCH() returns a relevance value; that is, a similarity measure between the search string and the text in that row in the columns named in the MATCH() list.
mysql> CREATE TABLE articles (
         id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
         title VARCHAR(200),
         body TEXT,
         FULLTEXT (title,body)
       ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.08 sec)

mysql> INSERT INTO articles (title,body) VALUES
       ('MySQL Tutorial','DBMS stands for DataBase ...'),
       ('How To Use MySQL Well','After you went through a ...'),
       ('Optimizing MySQL','In this tutorial, we show ...'),
       ('1001 MySQL Tricks','1. Never run mysqld as root. 2. ...'),
       ('MySQL vs. YourSQL','In the following database comparison ...'),
       ('MySQL Security','When configured properly, MySQL ...');
Query OK, 6 rows affected (0.01 sec)
Records: 6  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM articles
       WHERE MATCH (title,body)
       AGAINST ('database' IN NATURAL LANGUAGE MODE);
+----+-------------------+------------------------------------------+
| id | title             | body                                     |
+----+-------------------+------------------------------------------+
|  1 | MySQL Tutorial    | DBMS stands for DataBase ...             |
|  5 | MySQL vs. YourSQL | In the following database comparison ... |
+----+-------------------+------------------------------------------+
2 rows in set (0.00 sec)
By default, the search is performed in case-insensitive fashion. To perform a case-sensitive full-text search, use a case-sensitive or binary collation for the indexed columns. For example, a column that uses the utf8mb4 character set can be assigned a collation of utf8mb4_0900_as_cs or utf8mb4_bin to make it case-sensitive for full-text searches.
When MATCH() is used in a WHERE clause, as in the example shown earlier, the rows returned are automatically sorted with the highest relevance first. Relevance values are nonnegative floating-point numbers. Zero relevance means no similarity. Relevance is computed based on the number of words in the row (document), the number of unique words in the row, the total number of words in the collection, and the number of rows that contain a particular word.
The term “document” may be used interchangeably with the term “row”, and both terms refer to the indexed part of the row. The term “collection” refers to the indexed columns and encompasses all rows.
To simply count matches, you could use a query like this:
mysql> SELECT COUNT(*) FROM articles
       WHERE MATCH (title,body)
       AGAINST ('database' IN NATURAL LANGUAGE MODE);
+----------+
| COUNT(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)
You might find it quicker to rewrite the query as follows:
mysql> SELECT
       COUNT(IF(MATCH (title,body) AGAINST ('database' IN NATURAL LANGUAGE MODE), 1, NULL))
       AS count
       FROM articles;
+-------+
| count |
+-------+
|     2 |
+-------+
1 row in set (0.03 sec)
The first query does some extra work (sorting the results by relevance) but also can use an index lookup based on the WHERE clause. The index lookup might make the first query faster if the search matches few rows. The second query performs a full table scan, which might be faster than the index lookup if the search term was present in most rows.
For natural-language full-text searches, the columns named in the MATCH() function must be the same columns included in some FULLTEXT index in your table. For the preceding query, note that the columns named in the MATCH() function (title and body) are the same as those named in the definition of the articles table's FULLTEXT index. To search the title or body separately, you would create separate FULLTEXT indexes for each column.
You can also perform a boolean search or a search with query expansion. These search types are described in Section 12.10.2, “Boolean Full-Text Searches”, and Section 12.10.3, “Full-Text Searches with Query Expansion”.
A full-text search that uses an index can name columns only from a single table in the MATCH() clause because an index cannot span multiple tables. For MyISAM tables, a boolean search can be done in the absence of an index (albeit more slowly), in which case it is possible to name columns from multiple tables.
The preceding example is a basic illustration that shows how to use the MATCH() function where rows are returned in order of decreasing relevance. The next example shows how to retrieve the relevance values explicitly. Returned rows are not ordered because the SELECT statement includes neither WHERE nor ORDER BY clauses:
mysql> SELECT id, MATCH (title,body)
       AGAINST ('Tutorial' IN NATURAL LANGUAGE MODE) AS score
       FROM articles;
+----+---------------------+
| id | score               |
+----+---------------------+
|  1 | 0.22764469683170319 |
|  2 |                   0 |
|  3 | 0.22764469683170319 |
|  4 |                   0 |
|  5 |                   0 |
|  6 |                   0 |
+----+---------------------+
6 rows in set (0.00 sec)
The following example is more complex. The query returns the relevance values and it also sorts the rows in order of decreasing relevance. To achieve this result, specify MATCH() twice: once in the SELECT list and once in the WHERE clause. This causes no additional overhead, because the MySQL optimizer notices that the two MATCH() calls are identical and invokes the full-text search code only once.
mysql> SELECT id, body, MATCH (title,body) AGAINST
       ('Security implications of running MySQL as root'
       IN NATURAL LANGUAGE MODE) AS score
       FROM articles WHERE MATCH (title,body) AGAINST
       ('Security implications of running MySQL as root'
       IN NATURAL LANGUAGE MODE);
+----+-------------------------------------+-----------------+
| id | body                                | score           |
+----+-------------------------------------+-----------------+
|  4 | 1. Never run mysqld as root. 2. ... | 1.5219271183014 |
|  6 | When configured properly, MySQL ... | 1.3114095926285 |
+----+-------------------------------------+-----------------+
2 rows in set (0.00 sec)
A phrase that is enclosed within double quote (") characters matches only rows that contain the phrase literally, as it was typed. The full-text engine splits the phrase into words and performs a search in the FULLTEXT index for the words. Nonword characters need not be matched exactly: Phrase searching requires only that matches contain exactly the same words as the phrase and in the same order. For example, "test phrase" matches "test, phrase". If the phrase contains no words that are in the index, the result is empty. For example, if all words are either stopwords or shorter than the minimum length of indexed words, the result is empty.
The MySQL FULLTEXT implementation regards any sequence of true word characters (letters, digits, and underscores) as a word. That sequence may also contain apostrophes ('), but not more than one in a row. This means that aaa'bbb is regarded as one word, but aaa''bbb is regarded as two words. Apostrophes at the beginning or the end of a word are stripped by the FULLTEXT parser; 'aaa'bbb' would be parsed as aaa'bbb.
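The word rule above can be approximated with a regular expression. The following Python sketch is an illustration only and is not the server's parser: the names WORD_PATTERN and split_words are made up for the example, and it ignores stopwords and minimum word length.

```python
import re
from typing import List

# Runs of word characters (letters, digits, underscores), optionally joined
# by single apostrophes. Two apostrophes in a row cannot match, so they split
# the text into two words, and leading or trailing apostrophes are left out.
WORD_PATTERN = re.compile(r"\w+(?:'\w+)*")


def split_words(text: str) -> List[str]:
    """Illustrative tokenizer for the word rule described above."""
    return WORD_PATTERN.findall(text)


print(split_words("aaa'bbb"))     # ["aaa'bbb"]
print(split_words("aaa''bbb"))    # ['aaa', 'bbb']
print(split_words("'aaa'bbb'"))   # ["aaa'bbb"]
```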
The built-in FULLTEXT parser determines where words start and end by looking for certain delimiter characters; for example, ' ' (space), ',' (comma), and '.' (period). If words are not separated by delimiters (as in, for example, Chinese), the built-in FULLTEXT parser cannot determine where a word begins or ends. To be able to add words or other indexed terms in such languages to a FULLTEXT index that uses the built-in FULLTEXT parser, you must preprocess them so that they are separated by some arbitrary delimiter. Alternatively, you can create FULLTEXT indexes using the ngram parser plugin (for Chinese, Japanese, or Korean) or the MeCab parser plugin (for Japanese).
It is possible to write a plugin that replaces the built-in full-text parser. For details, see The MySQL Plugin API. For example parser plugin source code, see the plugin/fulltext directory of a MySQL source distribution.
Some words are ignored in full-text searches:
- Any word that is too short is ignored. The default minimum length of words that are found by full-text searches is three characters for InnoDB search indexes, or four characters for MyISAM. You can control the cutoff by setting a configuration option before creating the index: innodb_ft_min_token_size configuration option for InnoDB search indexes, or ft_min_word_len for MyISAM.
  Note: This behavior does not apply to FULLTEXT indexes that use the ngram parser. For the ngram parser, token length is defined by the ngram_token_size option.
- Words in the stopword list are ignored. A stopword is a word such as “the” or “some” that is so common that it is considered to have zero semantic value. There is a built-in stopword list, but it can be overridden by a user-defined list. The stopword lists and related configuration options are different for InnoDB search indexes and MyISAM ones. Stopword processing is controlled by the configuration options innodb_ft_enable_stopword, innodb_ft_server_stopword_table, and innodb_ft_user_stopword_table for InnoDB search indexes, and ft_stopword_file for MyISAM ones.
See Section 12.10.4, “Full-Text Stopwords” to view default stopword lists and how to change them. The default minimum word length can be changed as described in Section 12.10.6, “Fine-Tuning MySQL Full-Text Search”.
Every correct word in the collection and in the query is weighted according to its significance in the collection or query. Thus, a word that is present in many documents has a lower weight, because it has lower semantic value in this particular collection. Conversely, if the word is rare, it receives a higher weight. The weights of the words are combined to compute the relevance of the row. This technique works best with large collections.
For very small tables, word distribution does not adequately reflect their semantic value, and this model may sometimes produce bizarre results for search indexes on MyISAM tables. For example, although the word “MySQL” is present in every row of the articles table shown earlier, a search for the word in a MyISAM search index produces no results:
mysql> SELECT * FROM articles
       WHERE MATCH (title,body)
       AGAINST ('MySQL' IN NATURAL LANGUAGE MODE);
Empty set (0.00 sec)
The search result is empty because the word “MySQL” is present in at least 50% of the rows, and so is effectively treated as a stopword. This filtering technique is more suitable for large data sets, where you might not want the result set to return every second row from a 1GB table, than for small data sets where it might cause poor results for popular terms.
The 50% threshold can surprise you when you first try full-text searching to see how it works, and makes InnoDB tables more suited to experimentation with full-text searches. If you create a MyISAM table and insert only one or two rows of text into it, every word in the text occurs in at least 50% of the rows. As a result, no search returns any results until the table contains more rows. Users who need to bypass the 50% limitation can build search indexes on InnoDB tables, or use the boolean search mode explained in Section 12.10.2, “Boolean Full-Text Searches”.
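To see why small MyISAM tables behave this way, the 50% rule can be modeled in a few lines of Python. This sketch only encodes the rule as stated here (a word present in at least half of the rows is effectively treated as a stopword). The function names and the simple \w+ tokenization are assumptions for illustration, not the MyISAM implementation.

```python
import re
from typing import List, Set

# Title and body of each row of the six-row articles table shown earlier.
ROWS = [
    "MySQL Tutorial DBMS stands for DataBase ...",
    "How To Use MySQL Well After you went through a ...",
    "Optimizing MySQL In this tutorial, we show ...",
    "1001 MySQL Tricks 1. Never run mysqld as root. 2. ...",
    "MySQL vs. YourSQL In the following database comparison ...",
    "MySQL Security When configured properly, MySQL ...",
]


def row_words(row: str) -> Set[str]:
    """Lowercased set of words in a row (simplified tokenization)."""
    return {word.lower() for word in re.findall(r"\w+", row)}


def is_too_common(word: str, rows: List[str], threshold: float = 0.5) -> bool:
    """True if the word occurs in at least `threshold` of the rows."""
    hits = sum(1 for row in rows if word.lower() in row_words(row))
    return hits / len(rows) >= threshold


print(is_too_common("MySQL", ROWS))      # True: present in all 6 rows
print(is_too_common("database", ROWS))   # False: present in 2 of 6 rows
print(is_too_common("hello", ["hello world", "another row"]))  # True: 1 of 2 rows
```

The last call shows the two-row case: a word in just one row already reaches the 50% mark.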
MySQL can perform boolean full-text searches using the IN BOOLEAN MODE modifier. With this modifier, certain characters have special meaning at the beginning or end of words in the search string. In the following query, the + and - operators indicate that a word must be present or absent, respectively, for a match to occur. Thus, the query retrieves all the rows that contain the word “MySQL” but that do not contain the word “YourSQL”:
mysql> SELECT * FROM articles WHERE MATCH (title,body)
       AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE);
+----+-----------------------+-------------------------------------+
| id | title                 | body                                |
+----+-----------------------+-------------------------------------+
|  1 | MySQL Tutorial        | DBMS stands for DataBase ...        |
|  2 | How To Use MySQL Well | After you went through a ...        |
|  3 | Optimizing MySQL      | In this tutorial, we show ...       |
|  4 | 1001 MySQL Tricks     | 1. Never run mysqld as root. 2. ... |
|  6 | MySQL Security        | When configured properly, MySQL ... |
+----+-----------------------+-------------------------------------+
In implementing this feature, MySQL uses what is sometimes referred to as implied Boolean logic, in which
- + stands for AND
- - stands for NOT
- [no operator] implies OR
Boolean full-text searches have these characteristics:
- They do not automatically sort rows in order of decreasing relevance.
- InnoDB tables require a FULLTEXT index on all columns of the MATCH() expression to perform boolean queries. Boolean queries against a MyISAM search index can work even without a FULLTEXT index, although a search executed in this fashion would be quite slow.
- The minimum and maximum word length full-text parameters apply to FULLTEXT indexes created using the built-in FULLTEXT parser and MeCab parser plugin. innodb_ft_min_token_size and innodb_ft_max_token_size are used for InnoDB search indexes. ft_min_word_len and ft_max_word_len are used for MyISAM search indexes.
  Minimum and maximum word length full-text parameters do not apply to FULLTEXT indexes created using the ngram parser. ngram token size is defined by the ngram_token_size option.
- The stopword list applies, controlled by innodb_ft_enable_stopword, innodb_ft_server_stopword_table, and innodb_ft_user_stopword_table for InnoDB search indexes, and ft_stopword_file for MyISAM ones.
- InnoDB full-text search does not support the use of multiple operators on a single search word, as in this example: '++apple'. Use of multiple operators on a single search word returns a syntax error to standard out. MyISAM full-text search successfully processes the same search, ignoring all operators except for the operator immediately adjacent to the search word.
- InnoDB full-text search only supports leading plus or minus signs. For example, InnoDB supports '+apple' but does not support 'apple+'. Specifying a trailing plus or minus sign causes InnoDB to report a syntax error.
- InnoDB full-text search does not support the use of a leading plus sign with wildcard ('+*'), a plus and minus sign combination ('+-'), or a leading plus and minus sign combination ('+-apple'). These invalid queries return a syntax error.
- InnoDB full-text search does not support the use of the @ symbol in boolean full-text searches. The @ symbol is reserved for use by the @distance proximity search operator.
- They do not use the 50% threshold that applies to MyISAM search indexes.
The boolean full-text search capability supports the following operators:
+
  A leading or trailing plus sign indicates that this word must be present in each row that is returned. InnoDB only supports leading plus signs.
-
  A leading or trailing minus sign indicates that this word must not be present in any of the rows that are returned. InnoDB only supports leading minus signs.
  Note: The - operator acts only to exclude rows that are otherwise matched by other search terms. Thus, a boolean-mode search that contains only terms preceded by - returns an empty result. It does not return “all rows except those containing any of the excluded terms.”
(no operator)
  By default (when neither + nor - is specified), the word is optional, but the rows that contain it are rated higher. This mimics the behavior of MATCH() ... AGAINST() without the IN BOOLEAN MODE modifier.
@distance
  This operator works on InnoDB tables only. It tests whether two or more words all start within a specified distance from each other, measured in words. Specify the search words within a double-quoted string immediately before the @distance operator, for example, MATCH(col1) AGAINST('"word1 word2 word3" @8' IN BOOLEAN MODE)
> <
  These two operators are used to change a word's contribution to the relevance value that is assigned to a row. The > operator increases the contribution and the < operator decreases it. See the example following this list.
( )
  Parentheses group words into subexpressions. Parenthesized groups can be nested.
~
  A leading tilde acts as a negation operator, causing the word's contribution to the row's relevance to be negative. This is useful for marking “noise” words. A row containing such a word is rated lower than others, but is not excluded altogether, as it would be with the - operator.
*
  The asterisk serves as the truncation (or wildcard) operator. Unlike the other operators, it is appended to the word to be affected. Words match if they begin with the word preceding the * operator.
  If a word is specified with the truncation operator, it is not stripped from a boolean query, even if it is too short or a stopword. Whether a word is too short is determined from the innodb_ft_min_token_size setting for InnoDB tables, or ft_min_word_len for MyISAM tables. These options are not applicable to FULLTEXT indexes that use the ngram parser.
  The wildcarded word is considered as a prefix that must be present at the start of one or more words. If the minimum word length is 4, a search for '+word +the*' could return fewer rows than a search for '+word +the', because the second query ignores the too-short search term the.
"
  A phrase that is enclosed within double quote (") characters matches only rows that contain the phrase literally, as it was typed. The full-text engine splits the phrase into words and performs a search in the FULLTEXT index for the words. Nonword characters need not be matched exactly: Phrase searching requires only that matches contain exactly the same words as the phrase and in the same order. For example, "test phrase" matches "test, phrase".
  If the phrase contains no words that are in the index, the result is empty. The words might not be in the index because of a combination of factors: if they do not exist in the text, are stopwords, or are shorter than the minimum length of indexed words.
The following examples demonstrate some search strings that use boolean full-text operators:
'apple banana'
  Find rows that contain at least one of the two words.
'+apple +juice'
  Find rows that contain both words.
'+apple macintosh'
  Find rows that contain the word “apple”, but rank rows higher if they also contain “macintosh”.
'+apple -macintosh'
  Find rows that contain the word “apple” but not “macintosh”.
'+apple ~macintosh'
  Find rows that contain the word “apple”, but if the row also contains the word “macintosh”, rate it lower than if the row does not. This is “softer” than a search for '+apple -macintosh', for which the presence of “macintosh” causes the row not to be returned at all.
'+apple +(>turnover <strudel)'
  Find rows that contain the words “apple” and “turnover”, or “apple” and “strudel” (in any order), but rank “apple turnover” higher than “apple strudel”.
'apple*'
  Find rows that contain words such as “apple”, “apples”, “applesauce”, or “applet”.
'"some words"'
  Find rows that contain the exact phrase “some words” (for example, rows that contain “some words of wisdom” but not “some noise words”). Note that the " characters that enclose the phrase are operator characters that delimit the phrase. They are not the quotation marks that enclose the search string itself.
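The matching side of the implied Boolean logic (+ for AND, - for NOT, no operator for OR) can be sketched as a small predicate. The Python below is a simplified model for illustration: boolean_match and words_of are hypothetical names, only leading + and - are handled, and relevance ranking and the other operators (~, >, <, *, parentheses, phrases, @distance) are not modeled.

```python
import re
from typing import List, Set


def words_of(text: str) -> Set[str]:
    """Lowercased set of words in the text (simplified tokenization)."""
    return {word.lower() for word in re.findall(r"\w+", text)}


def boolean_match(query: str, text: str) -> bool:
    """Return True if the text satisfies the +/-/no-operator terms of the query."""
    required: List[str] = []
    excluded: List[str] = []
    optional: List[str] = []
    for term in query.lower().split():
        if term.startswith("+"):
            required.append(term[1:])
        elif term.startswith("-"):
            excluded.append(term[1:])
        else:
            optional.append(term)
    present = words_of(text)
    if any(word in present for word in excluded):
        return False
    if required:
        # Optional words affect only ranking, which is not modeled here.
        return all(word in present for word in required)
    # Without required terms, at least one optional word must be present;
    # a query with only excluded terms therefore matches nothing.
    return any(word in present for word in optional)


rows = ["apple juice", "apple macintosh", "banana bread", "apple pie"]
for query in ["apple banana", "+apple +juice", "+apple -macintosh", "-macintosh"]:
    print(query, "->", [row for row in rows if boolean_match(query, row)])
```

The last query contains only an excluded term and matches no rows, in line with the note on the - operator.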
InnoDB full-text search is modeled on the Sphinx full-text search engine, and the algorithms used are based on BM25 and TF-IDF ranking algorithms. For these reasons, relevancy rankings for InnoDB boolean full-text search may differ from MyISAM relevancy rankings.
InnoDB uses a variation of the “term frequency-inverse document frequency” (TF-IDF) weighting system to rank a document's relevance for a given full-text search query. The TF-IDF weighting is based on how frequently a word appears in a document, offset by how frequently the word appears in all documents in the collection. In other words, the more frequently a word appears in a document, and the less frequently the word appears in the document collection, the higher the document is ranked.
How Relevancy Ranking is Calculated
The term frequency (TF) value is the number of times that a word appears in a document. The inverse document frequency (IDF) value of a word is calculated using the following formula, where total_records is the number of records in the collection, and matching_records is the number of records that the search term appears in.
${IDF} = log10( ${total_records} / ${matching_records} )
When a document contains a word multiple times, the IDF value is multiplied by the TF value:
${TF} * ${IDF}
Using the TF and IDF values, the relevancy ranking for a document is calculated using this formula:
${rank} = ${TF} * ${IDF} * ${IDF}
The formula is demonstrated in the following examples.
Relevancy Ranking for a Single Word Search
This example demonstrates the relevancy ranking calculation for a single-word search.
mysql> CREATE TABLE articles (
         id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
         title VARCHAR(200),
         body TEXT,
         FULLTEXT (title,body)
       ) ENGINE=InnoDB;
Query OK, 0 rows affected (1.04 sec)

mysql> INSERT INTO articles (title,body) VALUES
       ('MySQL Tutorial','This database tutorial ...'),
       ("How To Use MySQL",'After you went through a ...'),
       ('Optimizing Your Database','In this database tutorial ...'),
       ('MySQL vs. YourSQL','When comparing databases ...'),
       ('MySQL Security','When configured properly, MySQL ...'),
       ('Database, Database, Database','database database database'),
       ('1001 MySQL Tricks','1. Never run mysqld as root. 2. ...'),
       ('MySQL Full-Text Indexes', 'MySQL fulltext indexes use a ..');
Query OK, 8 rows affected (0.06 sec)
Records: 8  Duplicates: 0  Warnings: 0

mysql> SELECT id, title, body,
       MATCH (title,body) AGAINST ('database' IN BOOLEAN MODE) AS score
       FROM articles ORDER BY score DESC;
+----+------------------------------+-------------------------------------+---------------------+
| id | title                        | body                                | score               |
+----+------------------------------+-------------------------------------+---------------------+
|  6 | Database, Database, Database | database database database          |  1.0886961221694946 |
|  3 | Optimizing Your Database     | In this database tutorial ...       | 0.36289870738983154 |
|  1 | MySQL Tutorial               | This database tutorial ...          | 0.18144935369491577 |
|  2 | How To Use MySQL             | After you went through a ...        |                   0 |
|  4 | MySQL vs. YourSQL            | When comparing databases ...        |                   0 |
|  5 | MySQL Security               | When configured properly, MySQL ... |                   0 |
|  7 | 1001 MySQL Tricks            | 1. Never run mysqld as root. 2. ... |                   0 |
|  8 | MySQL Full-Text Indexes      | MySQL fulltext indexes use a ..     |                   0 |
+----+------------------------------+-------------------------------------+---------------------+
8 rows in set (0.00 sec)
There are 8 records in total, with 3 that match the “database” search term. The first record (id 6) contains the search term 6 times and has a relevancy ranking of 1.0886961221694946. This ranking value is calculated using a TF value of 6 (the “database” search term appears 6 times in record id 6) and an IDF value of 0.42596873216370745, which is calculated as follows (where 8 is the total number of records and 3 is the number of records that the search term appears in):
${IDF} = log10( 8 / 3 ) = 0.42596873216370745
The TF and IDF values are then entered into the ranking formula:
${rank} = ${TF} * ${IDF} * ${IDF}
Performing the calculation in the MySQL command-line client returns a ranking value of 1.088696164686938.
mysql> SELECT 6*log10(8/3)*log10(8/3);
+-------------------------+
| 6*log10(8/3)*log10(8/3) |
+-------------------------+
|       1.088696164686938 |
+-------------------------+
1 row in set (0.00 sec)
You may notice a slight difference in the ranking values returned by the SELECT ... MATCH ... AGAINST statement and the MySQL command-line client (1.0886961221694946 versus 1.088696164686938). The difference is due to how the casts between integers and floats/doubles are performed internally by InnoDB (along with related precision and rounding decisions), and how they are performed elsewhere, such as in the MySQL command-line client or other types of calculators.
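As a further check, the same formula appears to reproduce the scores of the other two matching rows. The sketch below assumes TF values of 1 for row id 1 and 2 for row id 3, counted by hand from the sample data; the column aliases are illustrative only.

```sql
-- IDF for 'database' is log10(8/3): 8 rows in total, 3 rows contain the term.
-- The TF values are assumptions taken from counting 'database' in title and body.
SELECT 1 * POW(LOG10(8/3), 2) AS rank_id1,   -- 'database' appears once in row id 1
       2 * POW(LOG10(8/3), 2) AS rank_id3;   -- 'database' appears twice in row id 3
```

The two values should come out near 0.1814 and 0.3629, which agrees with the scores reported for rows 1 and 3 up to the small floating-point difference described above.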
Relevancy Ranking for a Multiple Word Search
This example demonstrates the relevancy ranking calculation for a multiple-word full-text search based on the articles table and data used in the previous example.
If you search on more than one word, the relevancy ranking value is a sum of the relevancy ranking value for each word, as shown in this formula:
${rank} = ${TF} * ${IDF} * ${IDF} + ${TF} * ${IDF} * ${IDF}
Performing a search on two terms ('mysql tutorial') returns the following results:
mysql> SELECT id, title, body,
         MATCH (title,body) AGAINST ('mysql tutorial' IN BOOLEAN MODE) AS score
       FROM articles ORDER BY score DESC;
+----+------------------------------+-------------------------------------+----------------------+
| id | title                        | body                                | score                |
+----+------------------------------+-------------------------------------+----------------------+
|  1 | MySQL Tutorial               | This database tutorial ...          |   0.7405621409416199 |
|  3 | Optimizing Your Database     | In this database tutorial ...       |   0.3624762296676636 |
|  5 | MySQL Security               | When configured properly, MySQL ... | 0.031219376251101494 |
|  8 | MySQL Full-Text Indexes      | MySQL fulltext indexes use a ..     | 0.031219376251101494 |
|  2 | How To Use MySQL             | After you went through a ...        | 0.015609688125550747 |
|  4 | MySQL vs. YourSQL            | When comparing databases ...        | 0.015609688125550747 |
|  7 | 1001 MySQL Tricks            | 1. Never run mysqld as root. 2. ... | 0.015609688125550747 |
|  6 | Database, Database, Database | database database database          |                    0 |
+----+------------------------------+-------------------------------------+----------------------+
8 rows in set (0.00 sec)
In the first record (id 1), 'mysql' appears once and 'tutorial' appears twice. There are six matching records for 'mysql' and two matching records for 'tutorial'. The MySQL command-line client returns the expected ranking value when inserting these values into the ranking formula for a multiple word search:
mysql> SELECT (1*log10(8/6)*log10(8/6)) + (2*log10(8/2)*log10(8/2));
+-------------------------------------------------------+
| (1*log10(8/6)*log10(8/6)) + (2*log10(8/2)*log10(8/2)) |
+-------------------------------------------------------+
|                                    0.7405621541938003 |
+-------------------------------------------------------+
1 row in set (0.00 sec)
The slight difference in the ranking values returned by the SELECT ... MATCH ... AGAINST statement and the MySQL command-line client is explained in the preceding example.
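To see how much each search term contributes, the sum can be split into its two parts. This is only a sketch that reuses the counts stated for row id 1; the aliases mysql_part and tutorial_part are illustrative names.

```sql
-- Per-term contributions for row id 1, using the TF and match counts given above.
SELECT 1 * POW(LOG10(8/6), 2) AS mysql_part,      -- TF = 1; 6 of 8 rows contain 'mysql'
       2 * POW(LOG10(8/2), 2) AS tutorial_part;   -- TF = 2; 2 of 8 rows contain 'tutorial'
```

The parts should be roughly 0.0156 and 0.7250. Because 'tutorial' is the rarer term, it dominates the sum. The same two building blocks seem to account for the other scores in the result set: a single occurrence of 'tutorial' gives about 0.3625 (row id 3), and two occurrences of 'mysql' give about 0.0312 (rows id 5 and 8).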
Full-text search supports query expansion (and in particular, its variant “blind query expansion”). This is generally useful when a search phrase is too short, which often means that the user is relying on implied knowledge that the full-text search engine lacks. For example, a user searching for “database” may really mean that “MySQL”, “Oracle”, “DB2”, and “RDBMS” all are phrases that should match “databases” and should be returned, too. This is implied knowledge.
Blind query expansion (also known as automatic relevance feedback) is enabled by adding WITH QUERY EXPANSION or IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION following the search phrase. It works by performing the search twice, where the search phrase for the second search is the original search phrase concatenated with the few most highly relevant documents from the first search. Thus, if one of these documents contains the word “databases” and the word “MySQL”, the second search finds the documents that contain the word “MySQL” even if they do not contain the word “database”. The following example shows this difference:
mysql> SELECT * FROM articles
       WHERE MATCH (title,body)
       AGAINST ('database' IN NATURAL LANGUAGE MODE);
+----+-------------------+------------------------------------------+
| id | title             | body                                     |
+----+-------------------+------------------------------------------+
|  1 | MySQL Tutorial    | DBMS stands for DataBase ...             |
|  5 | MySQL vs. YourSQL | In the following database comparison ... |
+----+-------------------+------------------------------------------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM articles
       WHERE MATCH (title,body)
       AGAINST ('database' WITH QUERY EXPANSION);
+----+-----------------------+------------------------------------------+
| id | title                 | body                                     |
+----+-----------------------+------------------------------------------+
|  5 | MySQL vs. YourSQL     | In the following database comparison ... |
|  1 | MySQL Tutorial        | DBMS stands for DataBase ...             |
|  3 | Optimizing MySQL      | In this tutorial we show ...             |
|  6 | MySQL Security        | When configured properly, MySQL ...      |
|  2 | How To Use MySQL Well | After you went through a ...             |
|  4 | 1001 MySQL Tricks     | 1. Never run mysqld as root. 2. ...      |
+----+-----------------------+------------------------------------------+
6 rows in set (0.00 sec)
Another example could be searching for books by Georges Simenon about Maigret, when a user is not sure how to spell “Maigret”. A search for “Megre and the reluctant witnesses” finds only “Maigret and the Reluctant Witnesses” without query expansion. A search with query expansion finds all books with the word “Maigret” on the second pass.
Because blind query expansion tends to increase noise significantly by returning nonrelevant documents, use it only when a search phrase is short.
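For completeness, here is a sketch of the longer modifier form named earlier in this section. It assumes the same articles table and is expected to behave like the WITH QUERY EXPANSION query, since both modifiers enable blind query expansion.

```sql
-- Explicit natural-language form of the query expansion modifier.
SELECT * FROM articles
  WHERE MATCH (title,body)
  AGAINST ('database' IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION);
```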
The stopword list is loaded and searched for full-text queries using the server character set and collation (the values of the character_set_server and collation_server system variables). False hits or misses might occur for stopword lookups if the stopword file or columns used for full-text indexing or searches have a character set or collation different from character_set_server or collation_server.
Case sensitivity of stopword lookups depends on the server collation. For example, lookups are case-insensitive if the collation is utf8mb4_0900_ai_ci, whereas lookups are case-sensitive if the collation is utf8mb4_0900_as_cs or utf8mb4_bin.
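One way to check for such a mismatch is to compare the server settings with the character set and collation of the indexed columns. The following is a minimal sketch; the schema name test and table name articles are assumptions carried over from the earlier examples.

```sql
-- Server-level character set and collation used for stopword lookups.
SELECT @@GLOBAL.character_set_server AS server_charset,
       @@GLOBAL.collation_server     AS server_collation;

-- Character set and collation of the columns in the (assumed) test.articles table.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
  FROM INFORMATION_SCHEMA.COLUMNS
 WHERE TABLE_SCHEMA = 'test'
   AND TABLE_NAME = 'articles';
```

If the two result sets disagree for a column that is part of a FULLTEXT index, stopword lookups for that column may produce the false hits or misses described above.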
InnoDB has a relatively short list of default stopwords, because documents from technical, literary, and other sources often use short words as keywords or in significant phrases. For example, you might search for “to be or not to be” and expect to get a sensible result, rather than having all those words ignored.
To see the default InnoDB stopword list, query the INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD table.
mysql> SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD;
+-------+
| value |
+-------+
| a     |
| about |
| an    |
| are   |
| as    |
| at    |
| be    |
| by    |
| com   |
| de    |
| en    |
| for   |
| from  |
| how   |
| i     |
| in    |
| is    |
| it    |
| la    |
| of    |
| on    |
| or    |
| that  |
| the   |
| this  |
| to    |
| was   |
| what  |
| when  |
| where |
| who   |
| will  |
| with  |
| und   |
| the   |
| www   |
+-------+
36 rows in set (0.00 sec)
To define your own stopword list for all InnoDB tables, define a table with the same structure as the INNODB_FT_DEFAULT_STOPWORD table, populate it with stopwords, and set the value of the innodb_ft_server_stopword_table option to a value in the form db_name/table_name before creating the full-text index. The stopword table must have a single VARCHAR column named value. The following example demonstrates creating and configuring a new global stopword table for InnoDB.
-- Create a new stopword table

mysql> CREATE TABLE my_stopwords(value VARCHAR(30)) ENGINE = INNODB;
Query OK, 0 rows affected (0.01 sec)

-- Insert stopwords (for simplicity, a single stopword is used in this example)

mysql> INSERT INTO my_stopwords(value) VALUES ('Ishmael');
Query OK, 1 row affected (0.00 sec)

-- Create the table

mysql> CREATE TABLE opening_lines (
         id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
         opening_line TEXT(500),
         author VARCHAR(200),
         title VARCHAR(200)
       ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.01 sec)

-- Insert data into the table

mysql> INSERT INTO opening_lines(opening_line,author,title) VALUES
         ('Call me Ishmael.','Herman Melville','Moby-Dick'),
         ('A screaming comes across the sky.','Thomas Pynchon','Gravity\'s Rainbow'),
         ('I am an invisible man.','Ralph Ellison','Invisible Man'),
         ('Where now? Who now? When now?','Samuel Beckett','The Unnamable'),
         ('It was love at first sight.','Joseph Heller','Catch-22'),
         ('All this happened, more or less.','Kurt Vonnegut','Slaughterhouse-Five'),
         ('Mrs. Dalloway said she would buy the flowers herself.','Virginia Woolf','Mrs. Dalloway'),
         ('It was a pleasure to burn.','Ray Bradbury','Fahrenheit 451');
Query OK, 8 rows affected (0.00 sec)
Records: 8  Duplicates: 0  Warnings: 0

-- Set the innodb_ft_server_stopword_table option to the new stopword table

mysql> SET GLOBAL innodb_ft_server_stopword_table = 'test/my_stopwords';
Query OK, 0 rows affected (0.00 sec)

-- Create the full-text index (which rebuilds the table if no FTS_DOC_ID column is defined)

mysql> CREATE FULLTEXT INDEX idx ON opening_lines(opening_line);
Query OK, 0 rows affected, 1 warning (1.17 sec)
Records: 0  Duplicates: 0  Warnings: 1
Verify that the specified stopword ('Ishmael') does not appear by querying the words in INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE.
By default, words shorter than 3 characters or longer than 84 characters do not appear in an InnoDB full-text search index. Maximum and minimum word length values are configurable using the innodb_ft_max_token_size and innodb_ft_min_token_size variables. This default behavior does not apply to the ngram parser plugin. ngram token size is defined by the ngram_token_size option.
mysql> SET GLOBAL innodb_ft_aux_table='test/opening_lines';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT word FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE LIMIT 15;
+-----------+
| word      |
+-----------+
| across    |
| all       |
| burn      |
| buy       |
| call      |
| comes     |
| dalloway  |
| first     |
| flowers   |
| happened  |
| herself   |
| invisible |
| less      |
| love      |
| man       |
+-----------+
15 rows in set (0.00 sec)
To create stopword lists on a table-by-table basis, create other stopword tables and use the innodb_ft_user_stopword_table option to specify the stopword table that you want to use before you create the full-text index.
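A minimal sketch of the table-by-table approach follows. It assumes the test schema and the opening_lines table from the preceding example, and that innodb_ft_user_stopword_table can be set at session scope; the table name my_table_stopwords and the stopword 'pleasure' are hypothetical choices made for illustration.

```sql
-- Hypothetical per-table stopword list with the same single-column structure
-- (a VARCHAR column named value) as the global stopword table.
CREATE TABLE my_table_stopwords (value VARCHAR(30)) ENGINE=InnoDB;
INSERT INTO my_table_stopwords (value) VALUES ('pleasure');

-- Name the list in db_name/table_name form before the index is created.
SET SESSION innodb_ft_user_stopword_table = 'test/my_table_stopwords';

-- Re-create the full-text index so that it is built with the new list.
ALTER TABLE opening_lines DROP INDEX idx;
CREATE FULLTEXT INDEX idx ON opening_lines (opening_line);
```

After the index is rebuilt, querying INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE as shown earlier should no longer list the chosen stopword.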
The stopword file is loaded and searched using latin1 if character_set_server is ucs2, utf16, utf16le, or utf32.
To override the default stopword list for MyISAM tables, set the ft_stopword_file system variable. (See Section 5.1.8, “Server System Variables”.) The variable value should be the path name of the file containing the stopword list, or the empty string to disable stopword filtering. The server looks for the file in the data directory unless an absolute path name is given to specify a different directory. After changing the value of this variable or the contents of the stopword file, restart the server and rebuild your FULLTEXT indexes.
The stopword list is free-form, separating stopwords with any nonalphanumeric character such as newline, space, or comma. Exceptions are the underscore character (_) and a single apostrophe ('), which are treated as part of a word. The character set of the stopword list is the server's default character set; see Section 10.3.2, “Server Character Set and Collation”.
The following list shows the default stopwords for MyISAM search indexes. In a MySQL source distribution, you can find this list in the storage/myisam/ft_static.c file.
a's able about above according accordingly across actually after afterwards again against ain't all allow allows almost alone along already also although always am among amongst an and another any anybody anyhow anyone anything anyway anyways anywhere apart appear appreciate appropriate are aren't around as aside ask asking associated at available away awfully be became because become becomes becoming been before beforehand behind being believe below beside besides best better between beyond both brief but by c'mon c's came can can't cannot cant cause causes certain certainly changes clearly co com come comes concerning consequently consider considering contain containing contains corresponding could couldn't course currently definitely described despite did didn't different do does doesn't doing don't done down downwards during each edu eg eight either else elsewhere enough entirely especially et etc even ever every everybody everyone everything everywhere ex exactly example except far few fifth first five followed following follows for former formerly forth four from further furthermore get gets getting given gives go goes going gone got gotten greetings had hadn't happens hardly has hasn't have haven't having he he's hello help hence her here here's hereafter hereby herein hereupon hers herself hi him himself his hither hopefully how howbeit however i'd i'll i'm i've ie if ignored immediate in inasmuch inc indeed indicate indicated indicates inner insofar instead into inward is isn't it it'd it'll it's its itself just keep keeps kept know known knows last lately later latter latterly least less lest let let's like liked likely little look looking looks ltd mainly many may maybe me mean meanwhile merely might more moreover most mostly much must my myself name namely nd near nearly necessary need needs neither never nevertheless new next nine no nobody non none noone nor normally not nothing novel now nowhere obviously of off often oh ok okay old on once one ones 
only onto or other others otherwise ought our ours ourselves out outside over overall own particular particularly per perhaps placed please plus possible presumably probably provides que quite qv rather rd re really reasonably regarding regardless regards relatively respectively right said same saw say saying says second secondly see seeing seem seemed seeming seems seen self selves sensible sent serious seriously seven several shall she should shouldn't since six so some somebody somehow someone something sometime sometimes somewhat somewhere soon sorry specified specify specifying still sub such sup sure t's take taken tell tends th than thank thanks thanx that that's thats the their theirs them themselves then thence there there's thereafter thereby therefore therein theres thereupon these they they'd they'll they're they've think third this thorough thoroughly those though three through throughout thru thus to together too took toward towards tried tries truly try trying twice two un under unfortunately unless unlikely until unto up upon us use used useful uses using usually value various very via viz vs want wants was wasn't way we we'd we'll we're we've welcome well went were weren't what what's whatever when whence whenever where where's whereafter whereas whereby wherein whereupon wherever whether which while whither who who's whoever whole whom whose why will willing wish with within without won't wonder would wouldn't yes yet you you'd you'll you're you've your yours yourself yourselves zero
Full-text searches are supported for InnoDB and MyISAM tables only.
Full-text searches are not supported for partitioned tables. See Section 24.6, “Restrictions and Limitations on Partitioning”.
Full-text searches can be used with most multibyte character sets. The exception is that for Unicode, the utf8 character set can be used, but not the ucs2 character set. Although FULLTEXT indexes on ucs2 columns cannot be used, you can perform IN BOOLEAN MODE searches on a ucs2 column that has no such index. The remarks for utf8 also apply to utf8mb4, and the remarks for ucs2 also apply to utf16, utf16le, and utf32.
Ideographic languages such as Chinese and Japanese do not have word delimiters. Therefore, the built-in full-text parser cannot determine where words begin and end in these and other such languages. A character-based ngram full-text parser that supports Chinese, Japanese, and Korean (CJK), and a word-based MeCab parser plugin that supports Japanese are provided for use with InnoDB and MyISAM tables.
Although the use of multiple character sets within a single table is supported, all columns in a FULLTEXT index must use the same character set and collation.
The MATCH() column list must match exactly the column list in some FULLTEXT index definition for the table, unless this MATCH() is IN BOOLEAN MODE on a MyISAM table. For MyISAM tables, boolean-mode searches can be done on nonindexed columns, although they are likely to be slow.
The argument to AGAINST() must be a string value that is constant during query evaluation. This rules out, for example, a table column because that can differ for each row.
Index hints are more limited for FULLTEXT searches than for non-FULLTEXT searches. See Section 8.9.4, “Index Hints”.
For InnoDB, all DML operations (INSERT, UPDATE, DELETE) involving columns with full-text indexes are processed at transaction commit time. For example, for an INSERT operation, an inserted string is tokenized and decomposed into individual words. The individual words are then added to full-text index tables when the transaction is committed. As a result, full-text searches only return committed data.
The '%' character is not a supported wildcard character for full-text searches.
MySQL's full-text search capability has few user-tunable parameters. You can exert more control over full-text searching behavior if you have a MySQL source distribution because some changes require source code modifications. See Section 2.9, “Installing MySQL from Source”.
Full-text search is carefully tuned for effectiveness. Modifying the default behavior in most cases can actually decrease effectiveness. Do not alter the MySQL sources unless you know what you are doing.
Most full-text variables described in this section must be set at server startup time. A server restart is required to change them; they cannot be modified while the server is running.
Some variable changes require that you rebuild the FULLTEXT indexes in your tables. Instructions for doing so are given later in this section.
The minimum and maximum lengths of words to be indexed are defined by the innodb_ft_min_token_size and innodb_ft_max_token_size variables for InnoDB search indexes, and by ft_min_word_len and ft_max_word_len for MyISAM ones.
Minimum and maximum word length full-text parameters do not apply to FULLTEXT indexes created using the ngram parser. ngram token size is defined by the ngram_token_size option.
After changing any of these options, rebuild your FULLTEXT indexes for the change to take effect. For example, to make two-character words searchable, you could put the following lines in an option file:
[mysqld]
innodb_ft_min_token_size=2
ft_min_word_len=2
Then restart the server and rebuild your FULLTEXT indexes. For MyISAM tables, note the remarks regarding myisamchk in the instructions that follow for rebuilding MyISAM full-text indexes.
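After the restart, it can be useful to confirm that the server picked up the new limits before rebuilding any indexes. This is a small sketch that only reads the four variables named above.

```sql
-- Show the word-length limits currently in effect for InnoDB and MyISAM.
SHOW GLOBAL VARIABLES
  WHERE Variable_name IN ('innodb_ft_min_token_size',
                          'innodb_ft_max_token_size',
                          'ft_min_word_len',
                          'ft_max_word_len');
```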
For MyISAM search indexes, the 50% threshold for natural language searches is determined by the particular weighting scheme chosen. To disable it, look for the following line in storage/myisam/ftdefs.h:
#define GWS_IN_USE GWS_PROB
Change that line to this:
#define GWS_IN_USE GWS_FREQ
Then recompile MySQL. There is no need to rebuild the indexes in this case.
By making this change, you severely decrease MySQL's ability to provide adequate relevance values for the MATCH() function. If you really need to search for such common words, it would be better to search using IN BOOLEAN MODE instead, which does not observe the 50% threshold.
To change the operators used for boolean full-text searches on MyISAM tables, set the ft_boolean_syntax system variable. (InnoDB does not have an equivalent setting.) This variable can be changed while the server is running, but you must have privileges sufficient to set global system variables (see Section 5.1.9.1, “System Variable Privileges”). No rebuilding of indexes is necessary in this case.
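As a rough illustration, the current operator string can be inspected and set at runtime. The value used below is assumed to be the documented default for this variable (14 characters, with each operator identified by its position in the string); verify it against your server before relying on it.

```sql
-- Inspect the operator string currently in effect.
SELECT @@GLOBAL.ft_boolean_syntax;

-- Set the operator string; the value shown is assumed to be the default.
-- Inside a single-quoted string, "" is simply two double-quote characters.
SET GLOBAL ft_boolean_syntax = '+ -><()~*:""&|';
```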
For the built-in full-text parser, you can change the set of characters that are considered word characters in several ways, as described in the following list. After making the modification, rebuild the indexes for each table that contains any FULLTEXT indexes. Suppose that you want to treat the hyphen character ('-') as a word character. Use one of these methods:
Modify the MySQL source: In storage/innobase/handler/ha_innodb.cc (for InnoDB), or in storage/myisam/ftdefs.h (for MyISAM), see the true_word_char() and misc_word_char() macros. Add '-' to one of those macros and recompile MySQL.
Modify a character set file: This requires no recompilation. The true_word_char() macro uses a “character type” table to distinguish letters and numbers from other characters. You can edit the contents of the <ctype><map> array in one of the character set XML files to specify that '-' is a “letter.” Then use the given character set for your FULLTEXT indexes. For information about the <ctype><map> array format, see Section 10.13.1, “Character Definition Arrays”.
Add a new collation for the character set used by the indexed columns, and alter the columns to use that collation. For general information about adding collations, see Section 10.14, “Adding a Collation to a Character Set”. For an example specific to full-text indexing, see Section 12.10.7, “Adding a User-Defined Collation for Full-Text Indexing”.
For the changes to take effect, FULLTEXT indexes must be rebuilt after modifying any of the following full-text index variables: innodb_ft_min_token_size; innodb_ft_max_token_size; innodb_ft_server_stopword_table; innodb_ft_user_stopword_table; innodb_ft_enable_stopword; ngram_token_size. Modifying innodb_ft_min_token_size, innodb_ft_max_token_size, or ngram_token_size requires restarting the server.
To rebuild FULLTEXT indexes for an InnoDB table, use ALTER TABLE with the DROP INDEX and ADD INDEX options to drop and re-create each index.
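For example, a sketch of this drop-and-re-create step for the opening_lines table and its idx index from the stopword example earlier in this chapter (substitute your own table, index, and column names):

```sql
-- Drop the existing full-text index, then add it back so that it is
-- rebuilt with the current full-text variable settings.
ALTER TABLE opening_lines DROP INDEX idx;
ALTER TABLE opening_lines ADD FULLTEXT INDEX idx (opening_line);
```

The two steps are issued as separate statements here so that the index is fully removed before it is built again.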
Running OPTIMIZE TABLE on a table with a full-text index rebuilds the full-text index, removing deleted Document IDs and consolidating multiple entries for the same word, where possible.
To optimize a full-text index, enable innodb_optimize_fulltext_only and run OPTIMIZE TABLE.
mysql> SET GLOBAL innodb_optimize_fulltext_only=ON;
Query OK, 0 rows affected (0.01 sec)

mysql> OPTIMIZE TABLE opening_lines;
+--------------------+----------+----------+----------+
| Table              | Op       | Msg_type | Msg_text |
+--------------------+----------+----------+----------+
| test.opening_lines | optimize | status   | OK       |
+--------------------+----------+----------+----------+
1 row in set (0.01 sec)
To avoid lengthy rebuild times for full-text indexes on large tables, you can use the innodb_ft_num_word_optimize option to perform the optimization in stages. The innodb_ft_num_word_optimize option defines the number of words that are optimized each time OPTIMIZE TABLE is run. The default setting is 2000, which means that 2000 words are optimized each time OPTIMIZE TABLE is run. Subsequent OPTIMIZE TABLE operations continue from where the preceding OPTIMIZE TABLE operation ended.
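A staged optimization might look like the following sketch. The batch size of 5000 is an arbitrary illustrative value, assumed to be within the permitted range for innodb_ft_num_word_optimize, and opening_lines is the table from the earlier example.

```sql
-- Restrict OPTIMIZE TABLE to the full-text index and choose a batch size.
SET GLOBAL innodb_optimize_fulltext_only = ON;
SET GLOBAL innodb_ft_num_word_optimize = 5000;

-- Each run processes one batch of words; repeat until the index is covered.
OPTIMIZE TABLE opening_lines;
OPTIMIZE TABLE opening_lines;

-- Restore normal OPTIMIZE TABLE behavior afterwards.
SET GLOBAL innodb_optimize_fulltext_only = OFF;
```

How many runs are needed depends on the number of distinct words in the index, so the two OPTIMIZE TABLE statements here only indicate the pattern.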
If you modify full-text variables that affect indexing (ft_min_word_len, ft_max_word_len, or ft_stopword_file), or if you change the stopword file itself, you must rebuild your FULLTEXT indexes after making the changes and restarting the server.
To rebuild the FULLTEXT indexes for a MyISAM table, it is sufficient to do a QUICK repair operation:
mysql> REPAIR TABLE tbl_name QUICK;
Alternatively, use ALTER TABLE as just described. In some cases, this may be faster than a repair operation.
Each table that contains any FULLTEXT index must be repaired as just shown. Otherwise, queries for the table may yield incorrect results, and modifications to the table cause the server to see the table as corrupt and in need of repair.
If you use myisamchk to perform an operation that modifies MyISAM table indexes (such as repair or analyze), the FULLTEXT indexes are rebuilt using the default full-text parameter values for minimum word length, maximum word length, and stopword file unless you specify otherwise. This can result in queries failing.
The problem occurs because these parameters are known only by the server. They are not stored in MyISAM index files. To avoid the problem if you have modified the minimum or maximum word length or stopword file values used by the server, specify the same ft_min_word_len, ft_max_word_len, and ft_stopword_file values for myisamchk that you use for mysqld. For example, if you have set the minimum word length to 3, you can repair a table with myisamchk like this:
myisamchk --recover --ft_min_word_len=3 tbl_name.MYI
To ensure that myisamchk and the server use the same values for full-text parameters, place each one in both the [mysqld] and [myisamchk] sections of an option file:
[mysqld]
ft_min_word_len=3

[myisamchk]
ft_min_word_len=3
An alternative to using myisamchk for MyISAM table index modification is to use the REPAIR TABLE, ANALYZE TABLE, OPTIMIZE TABLE, or ALTER TABLE statements. These statements are performed by the server, which knows the proper full-text parameter values to use.
This section describes how to add a user-defined collation for full-text searches using the built-in full-text parser. The sample collation is like latin1_swedish_ci but treats the '-' character as a letter rather than as a punctuation character so that it can be indexed as a word character. General information about adding collations is given in Section 10.14, “Adding a Collation to a Character Set”; it is assumed that you have read it and are familiar with the files involved.
To add a collation for full-text indexing, use the following procedure. The instructions here add a collation for a simple character set, which as discussed in Section 10.14, “Adding a Collation to a Character Set”, can be created using a configuration file that describes the character set properties. For a complex character set such as Unicode, create collations using C source files that describe the character set properties.
Add a collation to the Index.xml file. The permitted range of IDs for user-defined collations is given in Section 10.14.2, “Choosing a Collation ID”. The ID must be unused, so choose a value different from 1025 if that ID is already taken on your system.
<charset name="latin1">
...
<collation name="latin1_fulltext_ci" id="1025"/>
</charset>
Declare the sort order for the collation in the latin1.xml file. In this case, the order can be copied from latin1_swedish_ci:
<collation name="latin1_fulltext_ci">
<map>
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E 2F
30 31 32 33 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F
40 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F
50 51 52 53 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F
60 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F
50 51 52 53 54 55 56 57 58 59 5A 7B 7C 7D 7E 7F
80 81 82 83 84 85 86 87 88 89 8A 8B 8C 8D 8E 8F
90 91 92 93 94 95 96 97 98 99 9A 9B 9C 9D 9E 9F
A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 AA AB AC AD AE AF
B0 B1 B2 B3 B4 B5 B6 B7 B8 B9 BA BB BC BD BE BF
41 41 41 41 5C 5B 5C 43 45 45 45 45 49 49 49 49
44 4E 4F 4F 4F 4F 5D D7 D8 55 55 55 59 59 DE DF
41 41 41 41 5C 5B 5C 43 45 45 45 45 49 49 49 49
44 4E 4F 4F 4F 4F 5D F7 D8 55 55 55 59 59 DE FF
</map>
</collation>
Modify the ctype array in latin1.xml. Change the value corresponding to 0x2D (which is the code for the '-' character) from 10 (punctuation) to 01 (uppercase letter). In the following array, this is the element in the fourth row down, third value from the end.
<ctype>
<map>
00
20 20 20 20 20 20 20 20 20 28 28 28 28 28 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
48 10 10 10 10 10 10 10 10 10 10 10 10 01 10 10
84 84 84 84 84 84 84 84 84 84 10 10 10 10 10 10
10 81 81 81 81 81 81 01 01 01 01 01 01 01 01 01
01 01 01 01 01 01 01 01 01 01 01 10 10 10 10 10
10 82 82 82 82 82 82 02 02 02 02 02 02 02 02 02
02 02 02 02 02 02 02 02 02 02 02 10 10 10 10 20
10 00 10 02 10 10 10 10 10 10 01 10 01 00 01 00
00 10 10 10 10 10 10 10 10 10 02 10 02 00 02 01
48 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01
01 01 01 01 01 01 01 10 01 01 01 01 01 01 01 02
02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02
02 02 02 02 02 02 02 10 02 02 02 02 02 02 02 02
</map>
</ctype>
Restart the server.
To employ the new collation, include it in the definition of columns that are to use it:
mysql> DROP TABLE IF EXISTS t1;
Query OK, 0 rows affected (0.13 sec)

mysql> CREATE TABLE t1 (
         a TEXT CHARACTER SET latin1 COLLATE latin1_fulltext_ci,
         FULLTEXT INDEX(a)
       ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.47 sec)
Test the collation to verify that hyphen is considered as a word character:
mysql> INSERT INTO t1 VALUES ('----'),('....'),('abcd');
Query OK, 3 rows affected (0.22 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM t1 WHERE MATCH a AGAINST ('----' IN BOOLEAN MODE);
+------+
| a    |
+------+
| ---- |
+------+
1 row in set (0.00 sec)
The built-in MySQL full-text parser uses the white space between words as a delimiter to determine where words begin and end, which is a limitation when working with ideographic languages that do not use word delimiters. To address this limitation, MySQL provides an ngram full-text parser that supports Chinese, Japanese, and Korean (CJK). The ngram full-text parser is supported for use with InnoDB and MyISAM.
MySQL also provides a MeCab full-text parser plugin for Japanese, which tokenizes documents into meaningful words. For more information, see Section 12.10.9, “MeCab Full-Text Parser Plugin”.
An ngram is a contiguous sequence of n characters from a given sequence of text. The ngram parser tokenizes a sequence of text into a contiguous sequence of n characters. For example, you can tokenize “abcd” for different values of n using the ngram full-text parser.
n=1: 'a', 'b', 'c', 'd'
n=2: 'ab', 'bc', 'cd'
n=3: 'abc', 'bcd'
n=4: 'abcd'
The ngram full-text parser is a built-in server plugin. As with other built-in server plugins, it is automatically loaded when the server is started.
The full-text search syntax described in Section 12.10, “Full-Text Search Functions” applies to the ngram parser plugin. Differences in parsing behavior are described in this section. Full-text-related configuration options, except for minimum and maximum word length options (innodb_ft_min_token_size, innodb_ft_max_token_size, ft_min_word_len, ft_max_word_len) are also applicable.
Configuring ngram Token Size
The ngram parser has a default ngram token size of 2 (bigram). For example, with a token size of 2, the ngram parser parses the string “abc def” into four tokens: “ab”, “bc”, “de” and “ef”.
ngram token size is configurable using the ngram_token_size configuration option, which has a minimum value of 1 and maximum value of 10.
Typically, ngram_token_size is set to the size of the largest token that you want to search for. If you only intend to search for single characters, set ngram_token_size to 1. A smaller token size produces a smaller full-text search index and faster searches. If you need to search for words composed of more than one character, set ngram_token_size accordingly. For example, “Happy Birthday” is “生日快乐” in simplified Chinese, where “生日” is “birthday”, and “快乐” translates as “happy”. To search on two-character words such as these, set ngram_token_size to a value of 2 or higher.
As a read-only variable, ngram_token_size may only be set as part of a startup string or in a configuration file:
Startup string:
mysqld --ngram_token_size=2
Configuration file:
[mysqld]
ngram_token_size=2
The following minimum and maximum word length configuration options are ignored for FULLTEXT indexes that use the ngram parser: innodb_ft_min_token_size, innodb_ft_max_token_size, ft_min_word_len, and ft_max_word_len.
Creating a FULLTEXT Index that Uses the ngram Parser
To create a FULLTEXT index that uses the ngram parser, specify WITH PARSER ngram with CREATE TABLE, ALTER TABLE, or CREATE INDEX.
The following example demonstrates creating a table with an ngram FULLTEXT index, inserting sample data (Simplified Chinese text), and viewing tokenized data in the INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE table.
mysql> USE test;
mysql> CREATE TABLE articles (
         id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
         title VARCHAR(200),
         body TEXT,
         FULLTEXT (title,body) WITH PARSER ngram
       ) ENGINE=InnoDB CHARACTER SET utf8mb4;
mysql> SET NAMES utf8mb4;
mysql> INSERT INTO articles (title,body) VALUES
         ('数据库管理','在本教程中我将向你展示如何管理数据库'),
         ('数据库应用开发','学习开发数据库应用程序');
mysql> SET GLOBAL innodb_ft_aux_table="test/articles";
mysql> SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE ORDER BY doc_id, position;
To add a FULLTEXT index to an existing table, you can use ALTER TABLE or CREATE INDEX. For example:
CREATE TABLE articles (
  id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
  title VARCHAR(200),
  body TEXT
) ENGINE=InnoDB CHARACTER SET utf8;
ALTER TABLE articles ADD FULLTEXT INDEX ft_index (title,body) WITH PARSER ngram;
# Or:
CREATE FULLTEXT INDEX ft_index ON articles (title,body) WITH PARSER ngram;
ngram Parser Space Handling
The ngram parser eliminates spaces when parsing. For example:
“ab cd” is parsed to “ab”, “cd”
“a bc” is parsed to “bc”
ngram Parser Stopword Handling
The built-in MySQL full-text parser compares words to entries in the stopword list. If a word is equal to an entry in the stopword list, the word is excluded from the index. For the ngram parser, stopword handling is performed differently. Instead of excluding tokens that are equal to entries in the stopword list, the ngram parser excludes tokens that contain stopwords. For example, assuming ngram_token_size=2, a document that contains “a,b” is parsed to “a,” and “,b”. If a comma (“,”) is defined as a stopword, both “a,” and “,b” are excluded from the index because they contain a comma.
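The difference between the two stopword rules can be sketched as follows. The code is a simplified model under the assumptions stated in the comments; the function names are hypothetical and it does not reproduce how either parser splits text internally.

```python
def ngram_tokens(text, n=2):
    """Model of ngram tokenization: split on white space, then slide a window of n characters."""
    return [w[i:i + n] for w in text.split() for i in range(len(w) - n + 1)]


def drop_equal(tokens, stopwords):
    """Rule described for the built-in parser: drop a token only if it equals a stopword."""
    return [t for t in tokens if t not in stopwords]


def drop_containing(tokens, stopwords):
    """Rule described for the ngram parser: drop a token if it contains any stopword."""
    return [t for t in tokens if not any(s in t for s in stopwords)]


tokens = ngram_tokens("a,b")              # "a," and ",b"
print(drop_equal(tokens, {","}))          # equality test keeps both tokens
print(drop_containing(tokens, {","}))     # containment test removes both tokens
```

Under the containment rule, a stopword longer than the token size can never occur inside a token, so it has no effect in this model.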
By default, the ngram parser uses the default stopword list, which contains a list of English stopwords. For a stopword list applicable to Chinese, Japanese, or Korean, you must create your own. For information about creating a stopword list, see Section 12.10.4, “Full-Text Stopwords”.
Stopwords longer than ngram_token_size are ignored.
ngram Parser Term Search
For natural language mode search, the search term is converted to a union of ngram terms. For example, the string “abc” (assuming ngram_token_size=2) is converted to “ab bc”. Given two documents, one containing “ab” and the other containing “abc”, the search term “ab bc” matches both documents.
For boolean mode search, the search term is converted to an ngram phrase search. For example, the string 'abc' (assuming ngram_token_size=2) is converted to '"ab bc"'. Given two documents, one containing 'ab' and the other containing 'abc', the search phrase '"ab bc"' only matches the document containing 'abc'.
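The two conversions can be compared with a small Python model. It is a sketch of the matching semantics described above (union of ngrams versus ngram phrase), with hypothetical function names; relevance ranking and all other server behavior are deliberately left out.

```python
def ngram_tokens(text, n=2):
    """Model of ngram tokenization: split on white space, then slide a window of n characters."""
    return [w[i:i + n] for w in text.split() for i in range(len(w) - n + 1)]


def matches_natural(term, document, n=2):
    """Union semantics: the document matches if it contains any ngram of the term."""
    doc_tokens = set(ngram_tokens(document, n))
    return any(t in doc_tokens for t in ngram_tokens(term, n))


def matches_boolean(term, document, n=2):
    """Phrase semantics: the ngrams of the term must occur consecutively and in order."""
    q = ngram_tokens(term, n)
    d = ngram_tokens(document, n)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


for doc in ("ab", "abc"):
    print(doc, matches_natural("abc", doc), matches_boolean("abc", doc))
```

In this model the natural language search for “abc” matches both documents, while the boolean search matches only the document containing “abc”.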
ngram Parser Wildcard Search
Because an ngram FULLTEXT index contains only ngrams, and does not contain information about the beginning of terms, wildcard searches may return unexpected results. The following behaviors apply to wildcard searches using ngram FULLTEXT search indexes:
If the prefix term of a wildcard search is shorter than ngram token size, the query returns all indexed rows that contain ngram tokens starting with the prefix term. For example, assuming ngram_token_size=2, a search on “a*” returns all rows starting with “a”.
If the prefix term of a wildcard search is longer than ngram token size, the prefix term is converted to an ngram phrase and the wildcard operator is ignored. For example, assuming ngram_token_size=2, an “abc*” wildcard search is converted to “ab bc”.
ngram Parser Phrase Search
Phrase searches are converted to ngram phrase searches. For example, the search phrase “abc” is converted to “ab bc”, which returns documents containing “abc” and “ab bc”.
The search phrase “abc def” is converted to “ab bc de ef”, which returns documents containing “abc def” and “ab bc de ef”. A document that contains “abcdef” is not returned.
The built-in MySQL full-text parser uses the white space between words as a delimiter to determine where words begin and end, which is a limitation when working with ideographic languages that do not use word delimiters. To address this limitation for Japanese, MySQL provides a MeCab full-text parser plugin. The MeCab full-text parser plugin is supported for use with InnoDB and MyISAM.
MySQL also provides an ngram full-text parser plugin that supports Japanese. For more information, see Section 12.10.8, “ngram Full-Text Parser”.
The MeCab full-text parser plugin is a full-text parser plugin for Japanese that tokenizes a sequence of text into meaningful words. For example, MeCab tokenizes “データベース管理” (“Database Management”) into “データベース” (“Database”) and “管理” (“Management”). By comparison, the ngram full-text parser tokenizes text into a contiguous sequence of n characters, where n represents a number between 1 and 10.
Besides tokenizing text into meaningful words, the MeCab parser typically produces smaller indexes than the ngram parser, and MeCab full-text searches are generally faster. One drawback is that it may take longer for the MeCab full-text parser to tokenize documents, compared to the ngram full-text parser.
The full-text search syntax described in Section 12.10, “Full-Text Search Functions” applies to the MeCab parser plugin. Differences in parsing behavior are described in this section. Full-text related configuration options are also applicable.
For additional information about the MeCab parser, refer to the MeCab: Yet Another Part-of-Speech and Morphological Analyzer project on GitHub.
Installing the MeCab Parser Plugin
The MeCab parser plugin requires mecab and mecab-ipadic.
On supported Fedora, Debian and Ubuntu platforms (except Ubuntu 12.04, where the system mecab version is too old), MySQL dynamically links to the system mecab installation if it is installed to the default location. On other supported Unix-like platforms, libmecab.so is statically linked in libpluginmecab.so, which is located in the MySQL plugin directory. mecab-ipadic is included in MySQL binaries and is located in MYSQL_HOME/lib/mecab.
You can install mecab and mecab-ipadic using a native package management utility (on Fedora, Debian, and Ubuntu), or you can build mecab and mecab-ipadic from source. For information about installing mecab and mecab-ipadic using a native package management utility, see Installing MeCab From a Binary Distribution (Optional). If you want to build mecab and mecab-ipadic from source, see Installing MeCab From Source (Optional).
On Windows, libmecab.dll is found in the MySQL bin directory. mecab-ipadic is located in MYSQL_HOME/lib/mecab.
To install and configure the MeCab parser plugin, perform the following steps:
1. In the MySQL configuration file, set the mecab_rc_file configuration option to the location of the mecabrc configuration file, which is the configuration file for MeCab. If you are using the MeCab package distributed with MySQL, the mecabrc file is located in MYSQL_HOME/lib/mecab/etc/.
[mysqld]
loose-mecab-rc-file=MYSQL_HOME/lib/mecab/etc/mecabrc
The loose prefix is an option modifier. The mecab_rc_file option is not recognized by MySQL until the MeCab parser plugin is installed, but it must be set before attempting to install the MeCab parser plugin. The loose prefix allows you to restart MySQL without encountering an error due to an unrecognized variable.
If you use your own MeCab installation, or build MeCab from source, the location of the mecabrc configuration file may differ.
For information about the MySQL configuration file and its location, see Section 4.2.2.2, “Using Option Files”.
2. Also in the MySQL configuration file, set the minimum token size to 1 or 2, which are the values recommended for use with the MeCab parser. For InnoDB tables, minimum token size is defined by the innodb_ft_min_token_size configuration option, which has a default value of 3. For MyISAM tables, minimum token size is defined by ft_min_word_len, which has a default value of 4.
[mysqld]
innodb_ft_min_token_size=1
3. Modify the mecabrc configuration file to specify the dictionary you want to use. The mecab-ipadic package distributed with MySQL binaries includes three dictionaries (ipadic_euc-jp, ipadic_sjis, and ipadic_utf-8). The mecabrc configuration file packaged with MySQL contains an entry similar to the following:
dicdir = /path/to/mysql/lib/mecab/lib/mecab/dic/ipadic_euc-jp
To use the ipadic_utf-8 dictionary, for example, modify the entry as follows:
dicdir=MYSQL_HOME/lib/mecab/dic/ipadic_utf-8
If you are using your own MeCab installation or have built MeCab from source, the default dicdir entry in the mecabrc file is likely to differ, as are the dictionaries and their location.
Note: After the MeCab parser plugin is installed, you can use the mecab_charset status variable to view the character set used with MeCab. The three MeCab dictionaries provided with the MySQL binary support the following character sets.
The ipadic_euc-jp dictionary supports the ujis and eucjpms character sets.
The ipadic_sjis dictionary supports the sjis and cp932 character sets.
The ipadic_utf-8 dictionary supports the utf8 and utf8mb4 character sets.
mecab_charset only reports the first supported character set. For example, the ipadic_utf-8 dictionary supports both utf8 and utf8mb4. mecab_charset always reports utf8 when this dictionary is in use.
4. Restart MySQL.
5. Install the MeCab parser plugin:
The MeCab parser plugin is installed using INSTALL PLUGIN syntax. The plugin name is mecab, and the shared library name is libpluginmecab.so. For additional information about installing plugins, see Section 5.6.1, “Installing and Uninstalling Plugins”.
INSTALL PLUGIN mecab SONAME 'libpluginmecab.so';
Once installed, the MeCab parser plugin loads at every normal MySQL restart.
6. Verify that the MeCab parser plugin is loaded using the SHOW PLUGINS statement.
mysql> SHOW PLUGINS;
A mecab plugin should appear in the list of plugins.
Creating a FULLTEXT Index that uses the MeCab Parser
To create a FULLTEXT index that uses the mecab parser, specify WITH PARSER mecab with CREATE TABLE, ALTER TABLE, or CREATE INDEX.
This example demonstrates creating a table with a mecab FULLTEXT index, inserting sample data, and viewing tokenized data in the INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE table:
mysql> USE test;
mysql> CREATE TABLE articles (
         id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
         title VARCHAR(200),
         body TEXT,
         FULLTEXT (title,body) WITH PARSER mecab
       ) ENGINE=InnoDB CHARACTER SET utf8;
mysql> SET NAMES utf8;
mysql> INSERT INTO articles (title,body) VALUES
         ('データベース管理','このチュートリアルでは、私はどのようにデータベースを管理する方法を紹介します'),
         ('データベースアプリケーション開発','データベースアプリケーションを開発することを学ぶ');
mysql> SET GLOBAL innodb_ft_aux_table="test/articles";
mysql> SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE ORDER BY doc_id, position;
To add a FULLTEXT index to an existing table, you can use ALTER TABLE or CREATE INDEX. For example:
CREATE TABLE articles (
  id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
  title VARCHAR(200),
  body TEXT
) ENGINE=InnoDB CHARACTER SET utf8;
ALTER TABLE articles ADD FULLTEXT INDEX ft_index (title,body) WITH PARSER mecab;
# Or:
CREATE FULLTEXT INDEX ft_index ON articles (title,body) WITH PARSER mecab;
MeCab Parser Space Handling
The MeCab parser uses spaces as separators in query strings. For example, the MeCab parser tokenizes データベース管理 as データベース and 管理.
MeCab Parser Stopword Handling
By default, the MeCab parser uses the default stopword list, which contains a short list of English stopwords. For a stopword list applicable to Japanese, you must create your own. For information about creating stopword lists, see Section 12.10.4, “Full-Text Stopwords”.
MeCab Parser Term Search
For natural language mode search, the search term is converted to a union of tokens. For example, データベース管理 is converted to データベース 管理.
SELECT COUNT(*) FROM articles WHERE MATCH(title,body) AGAINST('データベース管理' IN NATURAL LANGUAGE MODE);
For boolean mode search, the search term is converted to a search phrase. For example, データベース管理 is converted to データベース 管理.
SELECT COUNT(*) FROM articles WHERE MATCH(title,body) AGAINST('データベース管理' IN BOOLEAN MODE);
MeCab Parser Wildcard Search
Wildcard search terms are not tokenized. A search on データベース管理* is performed on the prefix, データベース管理.
SELECT COUNT(*) FROM articles WHERE MATCH(title,body) AGAINST('データベース*' IN BOOLEAN MODE);
MeCab Parser Phrase Search
Phrases are tokenized. For example, データベース管理 is tokenized as データベース 管理.
SELECT COUNT(*) FROM articles WHERE MATCH(title,body) AGAINST('"データベース管理"' IN BOOLEAN MODE);
Installing MeCab From a Binary Distribution (Optional)
This section describes how to install mecab and mecab-ipadic from a binary distribution using a native package management utility. For example, on Fedora, you can use Yum to perform the installation:
yum install mecab-devel
On Debian or Ubuntu, you can perform an APT installation:
apt-get install mecab
apt-get install mecab-ipadic
Installing MeCab From Source (Optional)
If you want to build mecab and mecab-ipadic from source, basic installation steps are provided below. For additional information, refer to the MeCab documentation.
1. Download the tar.gz packages for mecab and mecab-ipadic from http://taku910.github.io/mecab/#download. As of February 2016, the latest available packages are mecab-0.996.tar.gz and mecab-ipadic-2.7.0-20070801.tar.gz.
2. Install mecab:
tar zxfv mecab-0.996.tar.gz
cd mecab-0.996
./configure
make
make check
su
make install
3. Install mecab-ipadic:
tar zxfv mecab-ipadic-2.7.0-20070801.tar.gz
cd mecab-ipadic-2.7.0-20070801
./configure
make
su
make install
4. Compile MySQL using the WITH_MECAB CMake option. Set the WITH_MECAB option to system if you have installed mecab and mecab-ipadic to the default location.
-DWITH_MECAB=system
If you defined a custom installation directory, set WITH_MECAB to the custom directory. For example:
-DWITH_MECAB=/path/to/mecab
Cast functions and operators enable conversion of values from one data type to another.
CONVERT() with a USING clause converts data between different character sets:
CONVERT(expr USING transcoding_name)
In MySQL, transcoding names are the same as the corresponding character set names.
Examples:
SELECT CONVERT('test' USING utf8mb4);
SELECT CONVERT(_latin1'Müller' USING utf8mb4);
INSERT INTO utf8mb4_table (utf8mb4_column)
    SELECT CONVERT(latin1_column USING utf8mb4) FROM latin1_table;
To convert strings between different character sets, you can also use CONVERT(expr, type) syntax (without USING), or CAST(expr AS type), which is equivalent:
CONVERT(string, CHAR[(N)] CHARACTER SET charset_name)
CAST(string AS CHAR[(N)] CHARACTER SET charset_name)
Examples:
SELECT CONVERT('test', CHAR CHARACTER SET utf8mb4);
SELECT CAST('test' AS CHAR CHARACTER SET utf8mb4);
If you specify CHARACTER SET charset_name as just shown, the character set and collation of the result are charset_name and the default collation of charset_name. If you omit CHARACTER SET charset_name, the character set and collation of the result are defined by the character_set_connection and collation_connection system variables that determine the default connection character set and collation (see Section 10.4, “Connection Character Sets and Collations”).
A COLLATE clause is not permitted within a CONVERT() or CAST() call, but you can apply it to the function result. For example, these are legal:
SELECT CONVERT('test' USING utf8mb4) COLLATE utf8mb4_bin;
SELECT CONVERT('test', CHAR CHARACTER SET utf8mb4) COLLATE utf8mb4_bin;
SELECT CAST('test' AS CHAR CHARACTER SET utf8mb4) COLLATE utf8mb4_bin;
But these are illegal:
SELECT CONVERT('test' USING utf8mb4 COLLATE utf8mb4_bin);
SELECT CONVERT('test', CHAR CHARACTER SET utf8mb4 COLLATE utf8mb4_bin);
SELECT CAST('test' AS CHAR CHARACTER SET utf8mb4 COLLATE utf8mb4_bin);
Normally, you cannot compare a BLOB value or other binary string in case-insensitive fashion because binary strings use the binary character set, which has no collation with the concept of lettercase. To perform a case-insensitive comparison, first use the CONVERT() or CAST() function to convert the value to a nonbinary string. Comparisons of the resulting string use its collation. For example, if the conversion result character set has a case-insensitive collation, a LIKE operation is not case-sensitive. That is true for the following operation because the default utf8mb4 collation (utf8mb4_0900_ai_ci) is not case-sensitive:
SELECT 'A' LIKE CONVERT(blob_col USING utf8mb4) FROM tbl_name;
To specify a particular collation for the converted string, use a COLLATE clause following the CONVERT() call:
SELECT 'A' LIKE CONVERT(blob_col USING utf8mb4) COLLATE utf8mb4_unicode_ci FROM tbl_name;
To use a different character set, substitute its name for utf8mb4 in the preceding statements (and similarly to use a different collation).
CONVERT() and CAST() can be used more generally for comparing strings represented in different character sets. For example, a comparison of these strings results in an error because they have different character sets:
mysql> SET @s1 = _latin1 'abc', @s2 = _latin2 'abc';
mysql> SELECT @s1 = @s2;
ERROR 1267 (HY000): Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (latin2_general_ci,IMPLICIT) for operation '='
Converting one of the strings to a character set compatible with the other enables the comparison to occur without error:
mysql> SELECT @s1 = CONVERT(@s2 USING latin1);
+---------------------------------+
| @s1 = CONVERT(@s2 USING latin1) |
+---------------------------------+
| 1 |
+---------------------------------+
For string literals, another way to specify the character set is to use a character set introducer. _latin1 and _latin2 in the preceding example are instances of introducers. Unlike conversion functions such as CAST() or CONVERT(), which convert a string from one character set to another, an introducer designates a string literal as having a particular character set, with no conversion involved. For more information, see Section 10.3.8, “Character Set Introducers”.
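The distinction between designating a character set and converting to one can be illustrated outside MySQL. The following Python sketch is only an analogy, using Python codec names rather than MySQL character set names: decoding plays the role of labelling existing bytes, and re-encoding plays the role of CONVERT(... USING ...). The helper name `transcode` is hypothetical.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Interpret data in the source encoding, then re-encode it in the target encoding."""
    return data.decode(source).encode(target)


latin1_bytes = "M\u00fcller".encode("latin-1")   # the bytes of 'Müller' in latin1

# Labelling: the bytes are unchanged, only their interpretation is stated.
labelled = latin1_bytes.decode("latin-1")

# Converting: the same characters are stored with different bytes.
utf8_bytes = transcode(latin1_bytes, "latin-1", "utf-8")

print(latin1_bytes, utf8_bytes, labelled == utf8_bytes.decode("utf-8"))
```

The character ü occupies one byte in latin1 and two bytes in UTF-8, so conversion changes the stored bytes while the text stays the same.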
Character set conversion is also useful preceding lettercase conversion of binary strings. LOWER() and UPPER() are ineffective when applied directly to binary strings because the concept of lettercase does not apply. To perform lettercase conversion of a binary string, first convert it to a nonbinary string using a character set appropriate for the data stored in the string:
mysql> SET @str = BINARY 'New York';
mysql> SELECT LOWER(@str), LOWER(CONVERT(@str USING utf8mb4));
+-------------+------------------------------------+
| LOWER(@str) | LOWER(CONVERT(@str USING utf8mb4)) |
+-------------+------------------------------------+
| New York    | new york                           |
+-------------+------------------------------------+
Be aware that if you convert an indexed column using BINARY, CAST(), or CONVERT(), MySQL may not be able to use the index efficiently.
The cast functions are useful for creating a column with a specific type in a CREATE TABLE ... SELECT statement:
mysql> CREATE TABLE new_table SELECT CAST('2000-01-01' AS DATE) AS c1;
mysql> SHOW CREATE TABLE new_table\G
*************************** 1. row ***************************
       Table: new_table
Create Table: CREATE TABLE `new_table` (
  `c1` date DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
The cast functions are useful for sorting ENUM columns in lexical order. Normally, sorting of ENUM columns occurs using the internal numeric values. Casting the values to CHAR results in a lexical sort:
SELECT enum_col FROM tbl_name ORDER BY CAST(enum_col AS CHAR);
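The difference between the two orderings can be seen in a short Python model. The member list and row values are hypothetical sample data, and the position in the list stands in for the internal numeric value of each ENUM member.

```python
# Hypothetical ENUM definition, in declaration order.
members = ["medium", "large", "small"]

# Hypothetical column values.
rows = ["small", "large", "medium", "large"]

# Default ENUM ordering: sort by position in the definition.
by_internal_value = sorted(rows, key=members.index)

# Ordering after casting to CHAR: sort by the string itself.
by_string = sorted(rows)

print(by_internal_value)
print(by_string)
```

Sorting by internal value follows the declaration order, whereas sorting the strings puts “large” before “medium”.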
CAST() also changes the result if you use it as part of a more complex expression such as CONCAT('Date: ',CAST(NOW() AS DATE)).
For temporal values, there is little need to use CAST() to extract data in different formats. Instead, use a function such as EXTRACT(), DATE_FORMAT(), or TIME_FORMAT(). See Section 12.7, “Date and Time Functions”.
To cast a string to a number, you normally need do nothing other than use the string value in numeric context:
mysql> SELECT 1+'1';
-> 2
That is also true for hexadecimal and bit literals, which are binary strings by default:
mysql> SELECT X'41', X'41'+0;
-> 'A', 65
mysql> SELECT b'1100001', b'1100001'+0;
-> 'a', 97
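The two results for each literal come from reading the same bytes in two ways. The Python sketch below mirrors that idea; it is an analogy rather than MySQL code, and the helper names are hypothetical.

```python
def hex_literal(digits: str) -> bytes:
    """Bytes denoted by a hexadecimal literal such as X'41'."""
    return bytes.fromhex(digits)


def bit_literal(bits: str) -> bytes:
    """Bytes denoted by a bit literal such as b'1100001', padded to whole bytes."""
    length = (len(bits) + 7) // 8
    return int(bits, 2).to_bytes(length, "big")


def numeric_value(data: bytes) -> int:
    """The same bytes read as a big-endian unsigned integer (the numeric context)."""
    return int.from_bytes(data, "big")


for data in (hex_literal("41"), bit_literal("1100001")):
    print(data.decode("ascii"), numeric_value(data))
```

In string context the byte 0x41 displays as the character A, and in numeric context the same byte is the integer 65.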
A string used in an arithmetic operation is converted to a floating-point number during expression evaluation.
A number used in string context is converted to a string:
mysql> SELECT CONCAT('hello you ',2);
-> 'hello you 2'
For information about implicit conversion of numbers to strings, see Section 12.3, “Type Conversion in Expression Evaluation”.
MySQL supports arithmetic with both signed and unsigned 64-bit values. For numeric operators (such as + or -) where one of the operands is an unsigned integer, the result is unsigned by default (see Section 12.6.1, “Arithmetic Operators”). To override this, use the SIGNED or UNSIGNED cast operator to cast a value to a signed or unsigned 64-bit integer, respectively.
mysql> SELECT 1 - 2;
-> -1
mysql> SELECT CAST(1 - 2 AS UNSIGNED);
-> 18446744073709551615
mysql> SELECT CAST(CAST(1 - 2 AS UNSIGNED) AS SIGNED);
-> -1
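The value 18446744073709551615 is the 64-bit two's-complement bit pattern of -1 read as an unsigned number. A minimal Python sketch of that reinterpretation follows; the function names are hypothetical, and the sketch covers only values that fit in 64 bits.

```python
BITS = 64
MASK = (1 << BITS) - 1          # 0xFFFFFFFFFFFFFFFF


def cast_unsigned(value: int) -> int:
    """Reinterpret a 64-bit two's-complement pattern as an unsigned integer."""
    return value & MASK


def cast_signed(value: int) -> int:
    """Reinterpret a 64-bit pattern as a signed integer."""
    value &= MASK
    return value - (1 << BITS) if value >= (1 << (BITS - 1)) else value


print(cast_unsigned(1 - 2))
print(cast_signed(cast_unsigned(1 - 2)))
```

Casting to unsigned and back to signed returns the original value because both casts keep the same 64 bits and change only how they are read.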
If either operand is a floating-point value, the result is a floating-point value and is not affected by the preceding rule. (In this context, DECIMAL column values are regarded as floating-point values.)
mysql> SELECT CAST(1 AS UNSIGNED) - 2.0;
-> -1.0
The SQL mode affects the result of conversion operations (see Section 5.1.11, “Server SQL Modes”). Examples:
For conversion of a “zero” date string to a date, CONVERT() and CAST() return NULL and produce a warning when the NO_ZERO_DATE SQL mode is enabled.
For integer subtraction, if the NO_UNSIGNED_SUBTRACTION SQL mode is enabled, the subtraction result is signed even if any operand is unsigned.
The following list describes the available cast functions and operators:
BINARY expr
The BINARY operator converts the expression to a binary string (a string that has the binary character set and binary collation). A common use for BINARY is to force a character string comparison to be done byte by byte using numeric byte values rather than character by character. The BINARY operator also causes trailing spaces in comparisons to be significant. For information about the differences between the binary collation of the binary character set and the _bin collations of nonbinary character sets, see Section 10.8.5, “The binary Collation Compared to _bin Collations”.
mysql> SELECT 'a' = 'A';
-> 1
mysql> SELECT BINARY 'a' = 'A';
-> 0
mysql> SELECT 'a' = 'a ';
-> 1
mysql> SELECT BINARY 'a' = 'a ';
-> 0
In a comparison, BINARY affects the entire operation; it can be given before either operand with the same result.
To convert a string expression to a binary string, these constructs are equivalent:
BINARY expr
CAST(expr AS BINARY)
CONVERT(expr USING BINARY)
If a value is a string literal, it can be designated as a binary string without performing any conversion by using the _binary character set introducer:
mysql> SELECT 'a' = 'A';
-> 1
mysql> SELECT _binary 'a' = 'A';
-> 0
For information about introducers, see Section 10.3.8, “Character Set Introducers”.
The BINARY operator in expressions differs in effect from the BINARY attribute in character column definitions. A character column defined with the BINARY attribute is assigned the table default character set and the binary (_bin) collation of that character set. Every nonbinary character set has a _bin collation. For example, if the table default character set is utf8mb4, these two column definitions are equivalent:
CHAR(10) BINARY
CHAR(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin
The use of CHARACTER SET binary in the definition of a CHAR, VARCHAR, or TEXT column causes the column to be treated as the corresponding binary string data type. For example, the following pairs of definitions are equivalent:
CHAR(10) CHARACTER SET binary
BINARY(10)
VARCHAR(10) CHARACTER SET binary
VARBINARY(10)
TEXT CHARACTER SET binary
BLOB
CAST(expr AS type [ARRAY])
CAST(timestamp_value AT TIME ZONE timezone_specifier AS DATETIME[(precision)])
timezone_specifier: [INTERVAL] '+00:00' | 'UTC'
The CAST() function takes an expression of any type and produces a result value of the specified type, similar to CONVERT(). For more information, see the description of CONVERT().
In MySQL 8.0.17 and later, InnoDB allows the use of an additional ARRAY keyword for creating a multi-valued index on a JSON array as part of CREATE INDEX, CREATE TABLE, and ALTER TABLE statements. ARRAY is not supported except when used to create a multi-valued index in one of these statements, in which case it is required. The column being indexed must be a column of type JSON. With ARRAY, the type following the AS keyword may specify any of the types supported by CAST(), with the exceptions of BINARY, JSON, and YEAR. For syntax information and examples, as well as other relevant information, see Multi-Valued Indexes.
Beginning with MySQL 8.0.22, CAST() supports retrieval of a TIMESTAMP value as being in UTC, using the AT TIME ZONE operator. The only supported time zone is UTC; this can be specified as either of '+00:00' or 'UTC'. The only return type supported by this syntax is DATETIME, with an optional precision specifier in the range of 0 to 6, inclusive. TIMESTAMP values that use timezone offsets are also supported.
mysql> SELECT @@system_time_zone;
+--------------------+
| @@system_time_zone |
+--------------------+
| EDT                |
+--------------------+
1 row in set (0.00 sec)
mysql> CREATE TABLE tz (c TIMESTAMP);
Query OK, 0 rows affected (0.41 sec)
mysql> INSERT INTO tz VALUES
    -> ROW(CURRENT_TIMESTAMP),
    -> ROW('2020-07-28 14:50:15+1:00');
Query OK, 1 row affected (0.08 sec)
mysql> TABLE tz;
+---------------------+
| c                   |
+---------------------+
| 2020-07-28 09:22:41 |
| 2020-07-28 09:50:15 |
+---------------------+
2 rows in set (0.00 sec)
mysql> SELECT CAST(c AT TIME ZONE '+00:00' AS DATETIME) AS u FROM tz;
+---------------------+
| u                   |
+---------------------+
| 2020-07-28 13:22:41 |
| 2020-07-28 13:50:15 |
+---------------------+
2 rows in set (0.00 sec)
mysql> SELECT CAST(c AT TIME ZONE 'UTC' AS DATETIME(2)) AS u FROM tz;
+------------------------+
| u                      |
+------------------------+
| 2020-07-28 13:22:41.00 |
| 2020-07-28 13:50:15.00 |
+------------------------+
2 rows in set (0.00 sec)
If you use 'UTC' as the time zone specifier with this form of CAST(), and the server raises an error such as Unknown or incorrect time zone: 'UTC', you may need to install the MySQL time zone tables (see Populating the Time Zone Tables).
AT TIME ZONE does not support the ARRAY keyword, and is not supported by the CONVERT() function.
CONVERT(expr USING transcoding_name)
CONVERT(expr, type)
The CONVERT() function takes an expression of any type and produces a result value of the specified type.
CONVERT(... USING ...) is standard SQL syntax. The non-USING form of CONVERT() is ODBC syntax.
CONVERT(expr USING transcoding_name) converts data between different character sets. In MySQL, transcoding names are the same as the corresponding character set names. For example, this statement converts the string 'abc' in the default character set to the corresponding string in the utf8mb4 character set:
SELECT CONVERT('abc' USING utf8mb4);
CONVERT(expr, type) syntax (without USING) takes an expression and a type value specifying the result type. This operation may also be expressed as CAST(expr AS type), which is equivalent. These type values are permitted:
BINARY[(N)]
Produces a string with the BINARY data type. For a description of how this affects comparisons, see Section 11.3.3, “The BINARY and VARBINARY Types”. If the optional length N is given, BINARY(N) causes the cast to use no more than N bytes of the argument. Values shorter than N bytes are padded with 0x00 bytes to a length of N.
CHAR[(N)] [charset_info]
Produces a string with the CHAR data type. If the optional length N is given, CHAR(N) causes the cast to use no more than N characters of the argument. No padding occurs for values shorter than N characters.
With no charset_info clause, CHAR produces a string with the default character set. To specify the character set explicitly, these charset_info values are permitted:
CHARACTER SET charset_name: Produces a string with the given character set.
ASCII: Shorthand for CHARACTER SET latin1.
UNICODE: Shorthand for CHARACTER SET ucs2.
In all cases, the string has the character set default collation.
DATE
Produces a
DATE
value.DATETIME
Produces a
DATETIME
value.DECIMAL[(
M
[,D
])]Produces a
DECIMAL
value. If the optionalM
andD
values are given, they specify the maximum number of digits (the precision) and the number of digits following the decimal point (the scale).DOUBLE
Produces a
DOUBLE
result. Added in MySQL 8.0.17.FLOAT[(
p
)]If the precision
p
is not specified, produces a result of typeFLOAT
. Ifp
is provided and 0 <= <p
<= 24, the result is of typeFLOAT
. If 25 <=p
<= 53, the result is of typeDOUBLE
. Ifp
< 0 orp
> 53, an error is returned. Added in MySQL 8.0.17.JSON
Produces a
JSON
value. For details on the rules for conversion of values betweenJSON
and other types, see Comparison and Ordering of JSON Values.NCHAR[(
N
)]Like
CHAR
, but produces a string with the national character set. See Section 10.3.7, “The National Character Set”.Unlike
CHAR
,NCHAR
does not permit trailing character set information to be specified.REAL
Produces a result of type
REAL
. This is actuallyFLOAT
ifREAL_AS_FLOAT
SQL mode is enabled; otherwise the result is of typeDOUBLE
.SIGNED [INTEGER]
Produces a signed integer value.
TIME
Produces a
TIME
value.UNSIGNED [INTEGER]
Produces an unsigned integer value.
YEAR
Produces a YEAR value. Added in MySQL 8.0.22. The rules governing conversion to YEAR are listed here:
For a four-digit number in the range 1901-2155 inclusive, or for a string which can be interpreted as a four-digit number in this range, return the corresponding YEAR value.
For a number consisting of one or two digits, or for a string which can be interpreted as such a number, return a YEAR value as follows:
If the number is in the range 1-69 inclusive, add 2000 and return the sum.
If the number is in the range 70-99 inclusive, add 1900 and return the sum.
For a string which evaluates to 0, return 2000.
For the number 0, return 0.
For a DATE, DATETIME, or TIMESTAMP value, return the YEAR portion of the value. For a TIME value, return the current year.
If you do not specify the type of a TIME argument, you may get a different result from what you expect, as shown here:
mysql> SELECT CONVERT("11:35:00", YEAR), CONVERT(TIME "11:35:00", YEAR);
+---------------------------+--------------------------------+
| CONVERT("11:35:00", YEAR) | CONVERT(TIME "11:35:00", YEAR) |
+---------------------------+--------------------------------+
|                      2011 |                           2020 |
+---------------------------+--------------------------------+
If the argument is of type DECIMAL, DOUBLE, or REAL, round the value to the nearest integer, then attempt to cast the value to YEAR using the rules for integer values, as shown here:
mysql> SELECT CONVERT(1944.35, YEAR), CONVERT(1944.50, YEAR);
+------------------------+------------------------+
| CONVERT(1944.35, YEAR) | CONVERT(1944.50, YEAR) |
+------------------------+------------------------+
|                   1944 |                   1945 |
+------------------------+------------------------+
1 row in set (0.00 sec)

mysql> SELECT CONVERT(66.35, YEAR), CONVERT(66.50, YEAR);
+----------------------+----------------------+
| CONVERT(66.35, YEAR) | CONVERT(66.50, YEAR) |
+----------------------+----------------------+
|                 2066 |                 2067 |
+----------------------+----------------------+
1 row in set (0.00 sec)
For a value that cannot be successfully converted to YEAR, return NULL.
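The number and digit-string rules above can be modeled outside the server. The following Python sketch is illustrative only and is not how MySQL implements the cast: the helper name to_year is hypothetical, None stands in for NULL, and date, time, and truncated-string arguments are deliberately left out.

```python
from decimal import Decimal, ROUND_HALF_UP

def to_year(value):
    """Model the number and digit-string rules; None stands in for NULL."""
    from_string = isinstance(value, str)
    if from_string:
        if not value.isdigit():
            raise ValueError("only plain digit strings are modeled in this sketch")
        number = int(value)
    else:
        # Round halves away from zero, as in the 1944.50 and 66.50 examples.
        number = int(Decimal(str(value)).quantize(Decimal(1), rounding=ROUND_HALF_UP))
    if 1901 <= number <= 2155:
        return number
    if number == 0:
        # The string '0' maps to 2000, the number 0 maps to 0.
        return 2000 if from_string else 0
    if 1 <= number <= 69:
        return 2000 + number
    if 70 <= number <= 99:
        return 1900 + number
    return None

for sample in (1944.35, 1944.50, 66.35, 66.50, 0, "0", 70, 3000):
    print(repr(sample), "->", to_year(sample))
```

The sketch uses decimal rounding rather than Python's built-in round(), because round() rounds halves to the nearest even integer and would not reproduce the 66.50 example.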
A string value containing non-numeric characters which must be truncated prior to conversion raises a warning, as shown here:
mysql> SELECT CONVERT("1979aaa", YEAR);
+--------------------------+
| CONVERT("1979aaa", YEAR) |
+--------------------------+
|                     1979 |
+--------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS;
+---------+------+-------------------------------------------+
| Level   | Code | Message                                   |
+---------+------+-------------------------------------------+
| Warning | 1292 | Truncated incorrect YEAR value: '1979aaa' |
+---------+------+-------------------------------------------+
Table 12.16 XML Functions
Name | Description |
---|---|
ExtractValue() | Extract a value from an XML string using XPath notation |
UpdateXML() | Return replaced XML fragment |
This section discusses XML and related functionality in MySQL.
It is possible to obtain XML-formatted output from MySQL in the mysql and mysqldump clients by invoking them with the --xml option. See Section 4.5.1, “mysql — The MySQL Command-Line Client”, and Section 4.5.4, “mysqldump — A Database Backup Program”.
Two functions providing basic XPath 1.0 (XML Path Language, version 1.0) capabilities are available. Some basic information about XPath syntax and usage is provided later in this section; however, an in-depth discussion of these topics is beyond the scope of this manual, and you should refer to the XML Path Language (XPath) 1.0 standard for definitive information. A useful resource for those new to XPath or who desire a refresher in the basics is the Zvon.org XPath Tutorial, which is available in several languages.
These functions remain under development. We continue to improve these and other aspects of XML and XPath functionality in MySQL 8.0 and onwards. You may discuss these, ask questions about them, and obtain help from other users with them in the MySQL XML User Forum.
XPath expressions used with these functions support user variables and local stored program variables. User variables are weakly checked; variables local to stored programs are strongly checked (see also Bug #26518):
User variables (weak checking). Variables using the syntax $@variable_name (that is, user variables) are not checked. No warnings or errors are issued by the server if a variable has the wrong type or has previously not been assigned a value. This also means the user is fully responsible for any typographical errors, since no warnings are given if (for example) $@myvairable is used where $@myvariable was intended.
Example:
mysql> SET @xml = '<a><b>X</b><b>Y</b></a>';
Query OK, 0 rows affected (0.00 sec)

mysql> SET @i = 1, @j = 2;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT @i, ExtractValue(@xml, '//b[$@i]');
+------+--------------------------------+
| @i   | ExtractValue(@xml, '//b[$@i]') |
+------+--------------------------------+
|    1 | X                              |
+------+--------------------------------+
1 row in set (0.00 sec)

mysql> SELECT @j, ExtractValue(@xml, '//b[$@j]');
+------+--------------------------------+
| @j   | ExtractValue(@xml, '//b[$@j]') |
+------+--------------------------------+
|    2 | Y                              |
+------+--------------------------------+
1 row in set (0.00 sec)

mysql> SELECT @k, ExtractValue(@xml, '//b[$@k]');
+------+--------------------------------+
| @k   | ExtractValue(@xml, '//b[$@k]') |
+------+--------------------------------+
| NULL |                                |
+------+--------------------------------+
1 row in set (0.00 sec)
Variables in stored programs (strong checking). Variables using the syntax $variable_name can be declared and used with these functions when they are called inside stored programs. Such variables are local to the stored program in which they are defined, and are strongly checked for type and value.
Example:
mysql> DELIMITER |

mysql> CREATE PROCEDURE myproc ()
    -> BEGIN
    ->   DECLARE i INT DEFAULT 1;
    ->   DECLARE xml VARCHAR(25) DEFAULT '<a>X</a><a>Y</a><a>Z</a>';
    ->
    ->   WHILE i < 4 DO
    ->     SELECT xml, i, ExtractValue(xml, '//a[$i]');
    ->     SET i = i+1;
    ->   END WHILE;
    -> END |
Query OK, 0 rows affected (0.01 sec)

mysql> DELIMITER ;

mysql> CALL myproc();
+--------------------------+---+------------------------------+
| xml                      | i | ExtractValue(xml, '//a[$i]') |
+--------------------------+---+------------------------------+
| <a>X</a><a>Y</a><a>Z</a> | 1 | X                            |
+--------------------------+---+------------------------------+
1 row in set (0.00 sec)

+--------------------------+---+------------------------------+
| xml                      | i | ExtractValue(xml, '//a[$i]') |
+--------------------------+---+------------------------------+
| <a>X</a><a>Y</a><a>Z</a> | 2 | Y                            |
+--------------------------+---+------------------------------+
1 row in set (0.01 sec)

+--------------------------+---+------------------------------+
| xml                      | i | ExtractValue(xml, '//a[$i]') |
+--------------------------+---+------------------------------+
| <a>X</a><a>Y</a><a>Z</a> | 3 | Z                            |
+--------------------------+---+------------------------------+
1 row in set (0.01 sec)
Parameters. Variables used in XPath expressions inside stored routines that are passed in as parameters are also subject to strong checking.
Expressions containing user variables or variables local to stored programs must otherwise (except for notation) conform to the rules for XPath expressions containing variables as given in the XPath 1.0 specification.
A user variable used to store an XPath expression is treated as an empty string. Because of this, it is not possible to store an XPath expression as a user variable. (Bug #32911)
ExtractValue(xml_frag, xpath_expr)
ExtractValue() takes two string arguments, a fragment of XML markup xml_frag and an XPath expression xpath_expr (also known as a locator); it returns the text (CDATA) of the first text node which is a child of the element or elements matched by the XPath expression.
Using this function is the equivalent of performing a match using the xpath_expr after appending /text(). In other words, ExtractValue('<a><b>Sakila</b></a>', '/a/b') and ExtractValue('<a><b>Sakila</b></a>', '/a/b/text()') produce the same result.
If multiple matches are found, the content of the first child text node of each matching element is returned (in the order matched) as a single, space-delimited string.
If no matching text node is found for the expression (including the implicit /text()), for whatever reason, an empty string is returned, provided that xpath_expr is valid and xml_frag consists of elements which are properly nested and closed. No distinction is made between a match on an empty element and no match at all. This is by design.
If you need to determine whether no matching element was found in xml_frag or such an element was found but contained no child text nodes, you should test the result of an expression that uses the XPath count() function. For example, both of these statements return an empty string, as shown here:
mysql> SELECT ExtractValue('<a><b/></a>', '/a/b');
+-------------------------------------+
| ExtractValue('<a><b/></a>', '/a/b') |
+-------------------------------------+
|                                     |
+-------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT ExtractValue('<a><c/></a>', '/a/b');
+-------------------------------------+
| ExtractValue('<a><c/></a>', '/a/b') |
+-------------------------------------+
|                                     |
+-------------------------------------+
1 row in set (0.00 sec)
However, you can determine whether there was actually a matching element using the following:
mysql> SELECT ExtractValue('<a><b/></a>', 'count(/a/b)');
+--------------------------------------------+
| ExtractValue('<a><b/></a>', 'count(/a/b)') |
+--------------------------------------------+
| 1                                          |
+--------------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT ExtractValue('<a><c/></a>', 'count(/a/b)');
+--------------------------------------------+
| ExtractValue('<a><c/></a>', 'count(/a/b)') |
+--------------------------------------------+
| 0                                          |
+--------------------------------------------+
1 row in set (0.01 sec)
Important
ExtractValue() returns only CDATA, and does not return any tags that might be contained within a matching tag, nor any of their content (see the result returned as val1 in the following example).
mysql> SELECT
    -> ExtractValue('<a>ccc<b>ddd</b></a>', '/a') AS val1,
    -> ExtractValue('<a>ccc<b>ddd</b></a>', '/a/b') AS val2,
    -> ExtractValue('<a>ccc<b>ddd</b></a>', '//b') AS val3,
    -> ExtractValue('<a>ccc<b>ddd</b></a>', '/b') AS val4,
    -> ExtractValue('<a>ccc<b>ddd</b><b>eee</b></a>', '//b') AS val5;
+------+------+------+------+---------+
| val1 | val2 | val3 | val4 | val5    |
+------+------+------+------+---------+
| ccc  | ddd  | ddd  |      | ddd eee |
+------+------+------+------+---------+
This function uses the current SQL collation for making comparisons with contains(), performing the same collation aggregation as other string functions (such as CONCAT()), in taking into account the collation coercibility of their arguments; see Section 10.8.4, “Collation Coercibility in Expressions”, for an explanation of the rules governing this behavior.
Previously, binary (that is, case-sensitive) comparison was always used.
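The difference between an empty match and no match can also be pictured with a small Python analogue. This is only a rough model built on the standard xml.etree.ElementTree module, not the server's XPath engine: the hypothetical helper extract_leading_text covers just the //tag form, and it assumes that only the text preceding an element's first child counts.

```python
import xml.etree.ElementTree as ET

def extract_leading_text(xml_frag, tag):
    """Return (joined_text, match_count) for every element named `tag`."""
    root = ET.fromstring(xml_frag)
    # iter() walks the tree in document order and includes the root itself.
    matches = list(root.iter(tag))
    # Element.text is the text before the first child element, or None.
    texts = [m.text for m in matches if m.text]
    return " ".join(texts), len(matches)

print(extract_leading_text("<a>ccc<b>ddd</b><b>eee</b></a>", "b"))
print(extract_leading_text("<a><b/></a>", "b"))
print(extract_leading_text("<a><c/></a>", "b"))
```

The last two calls return the same empty string, and only the match count tells them apart, which mirrors the reason for testing count() in the statements shown earlier.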
NULL is returned if xml_frag contains elements which are not properly nested or closed, and a warning is generated, as shown in this example:
mysql> SELECT ExtractValue('<a>c</a><b', '//a');
+-----------------------------------+
| ExtractValue('<a>c</a><b', '//a') |
+-----------------------------------+
| NULL                              |
+-----------------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Warning
   Code: 1525
Message: Incorrect XML value: 'parse error at line 1 pos 11: END-OF-INPUT unexpected ('>' wanted)'
1 row in set (0.00 sec)

mysql> SELECT ExtractValue('<a>c</a><b/>', '//a');
+-------------------------------------+
| ExtractValue('<a>c</a><b/>', '//a') |
+-------------------------------------+
| c                                   |
+-------------------------------------+
1 row in set (0.00 sec)
UpdateXML(xml_target, xpath_expr, new_xml)
This function replaces a single portion of a given fragment of XML markup xml_target with a new XML fragment new_xml, and then returns the changed XML. The portion of xml_target that is replaced matches an XPath expression xpath_expr supplied by the user.
If no expression matching xpath_expr is found, or if multiple matches are found, the function returns the original xml_target XML fragment. All three arguments should be strings.
mysql> SELECT
    -> UpdateXML('<a><b>ccc</b><d></d></a>', '/a', '<e>fff</e>') AS val1,
    -> UpdateXML('<a><b>ccc</b><d></d></a>', '/b', '<e>fff</e>') AS val2,
    -> UpdateXML('<a><b>ccc</b><d></d></a>', '//b', '<e>fff</e>') AS val3,
    -> UpdateXML('<a><b>ccc</b><d></d></a>', '/a/d', '<e>fff</e>') AS val4,
    -> UpdateXML('<a><d></d><b>ccc</b><d></d></a>', '/a/d', '<e>fff</e>') AS val5
    -> \G
*************************** 1. row ***************************
val1: <e>fff</e>
val2: <a><b>ccc</b><d></d></a>
val3: <a><e>fff</e><d></d></a>
val4: <a><b>ccc</b><e>fff</e></a>
val5: <a><d></d><b>ccc</b><d></d></a>
An in-depth discussion of XPath syntax and usage is beyond the scope of this manual. Please see the XML Path Language (XPath) 1.0 specification for definitive information. A useful resource for those new to XPath or who want a refresher in the basics is the Zvon.org XPath Tutorial, which is available in several languages.
Descriptions and examples of some basic XPath expressions follow:
/tag
Matches <tag/> if and only if <tag/> is the root element.
Example: /a has a match in <a><b/></a> because it matches the outermost (root) tag. It does not match the inner a element in <b><a/></b> because in this instance it is the child of another element.
/tag1/tag2
Matches <tag2/> if and only if it is a child of <tag1/>, and <tag1/> is the root element.
Example: /a/b matches the b element in the XML fragment <a><b/></a> because it is a child of the root element a. It does not have a match in <b><a/></b> because in this case, b is the root element (and hence the child of no other element). Nor does the XPath expression have a match in <a><c><b/></c></a>; here, b is a descendant of a, but not actually a child of a.
This construct is extendable to three or more elements. For example, the XPath expression /a/b/c matches the c element in the fragment <a><b><c/></b></a>.
//tag
Matches any instance of <tag>.
Example: //a matches the a element in any of the following: <a><b><c/></b></a>; <c><a><b/></a></c>; <c><b><a/></b></c>.
// can be combined with /. For example, //a/b matches the b element in either of the fragments <a><b/></a> or <c><a><b/></a></c>.
Note
//tag is the equivalent of /descendant-or-self::*/tag. A common error is to confuse this with /descendant-or-self::tag, although the latter expression can actually lead to very different results, as can be seen here:
mysql> SET @xml = '<a><b><c>w</c><b>x</b><d>y</d>z</b></a>';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT @xml;
+-----------------------------------------+
| @xml                                    |
+-----------------------------------------+
| <a><b><c>w</c><b>x</b><d>y</d>z</b></a> |
+-----------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT ExtractValue(@xml, '//b[1]');
+------------------------------+
| ExtractValue(@xml, '//b[1]') |
+------------------------------+
| x z                          |
+------------------------------+
1 row in set (0.00 sec)

mysql> SELECT ExtractValue(@xml, '//b[2]');
+------------------------------+
| ExtractValue(@xml, '//b[2]') |
+------------------------------+
|                              |
+------------------------------+
1 row in set (0.01 sec)

mysql> SELECT ExtractValue(@xml, '/descendant-or-self::*/b[1]');
+---------------------------------------------------+
| ExtractValue(@xml, '/descendant-or-self::*/b[1]') |
+---------------------------------------------------+
| x z                                               |
+---------------------------------------------------+
1 row in set (0.06 sec)

mysql> SELECT ExtractValue(@xml, '/descendant-or-self::*/b[2]');
+---------------------------------------------------+
| ExtractValue(@xml, '/descendant-or-self::*/b[2]') |
+---------------------------------------------------+
|                                                   |
+---------------------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT ExtractValue(@xml, '/descendant-or-self::b[1]');
+-------------------------------------------------+
| ExtractValue(@xml, '/descendant-or-self::b[1]') |
+-------------------------------------------------+
| z                                               |
+-------------------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT ExtractValue(@xml, '/descendant-or-self::b[2]');
+-------------------------------------------------+
| ExtractValue(@xml, '/descendant-or-self::b[2]') |
+-------------------------------------------------+
| x                                               |
+-------------------------------------------------+
1 row in set (0.00 sec)
The * operator acts as a “wildcard” that matches any element. For example, the expression /*/b matches the b element in either of the XML fragments <a><b/></a> or <c><b/></c>. However, the expression does not produce a match in the fragment <b><a/></b> because b must be a child of some other element. The wildcard may be used in any position: The expression /*/b/* matches any child of a b element that is itself not the root element.
You can match any of several locators using the | (UNION) operator. For example, the expression //b|//c matches all b and c elements in the XML target.
It is also possible to match an element based on the value of one or more of its attributes. This is done using the syntax tag[@attribute="value"]. For example, the expression //b[@id="idB"] matches the second b element in the fragment <a><b id="idA"/><c/><b id="idB"/></a>. To match against any element having attribute="value", use the XPath expression //*[@attribute="value"].
To filter multiple attribute values, simply use multiple attribute-comparison clauses in succession. For example, the expression //b[@c="x"][@d="y"] matches the element <b c="x" d="y"/> occurring anywhere in a given XML fragment.
To find elements for which the same attribute matches any of several values, you can use multiple locators joined by the | operator. For example, to match all b elements whose c attributes have either of the values 23 or 17, use the expression //b[@c="23"]|//b[@c="17"]. You can also use the logical or operator for this purpose: //b[@c="23" or @c="17"].
Note
The difference between or and | is that or joins conditions, while | joins result sets.
XPath Limitations. The XPath syntax supported by these functions is currently subject to the following limitations:
Nodeset-to-nodeset comparison (such as '/a/b[@c=@d]') is not supported.
All of the standard XPath comparison operators are supported. (Bug #22823)
Relative locator expressions are resolved in the context of the root node. For example, consider the following query and result:
mysql> SELECT ExtractValue(
    -> '<a><b c="1">X</b><b c="2">Y</b></a>',
    -> 'a/b'
    -> ) AS result;
+--------+
| result |
+--------+
| X Y    |
+--------+
1 row in set (0.03 sec)
In this case, the locator a/b resolves to /a/b.
Relative locators are also supported within predicates. In the following example, d[../@c="1"] is resolved as /a/b[@c="1"]/d:
mysql> SELECT ExtractValue(
    -> '<a>
    -> <b c="1"><d>X</d></b>
    -> <b c="2"><d>X</d></b>
    -> </a>',
    -> 'a/b/d[../@c="1"]')
    -> AS result;
+--------+
| result |
+--------+
| X      |
+--------+
1 row in set (0.00 sec)
Locators prefixed with expressions that evaluate as scalar values (including variable references, literals, numbers, and scalar function calls) are not permitted, and their use results in an error.
The :: operator is not supported in combination with node types such as the following:
axis::comment()
axis::text()
axis::processing-instruction()
axis::node()
However, name tests (such as axis::name and axis::*) are supported, as shown in these examples:
mysql> SELECT ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::b');
+-------------------------------------------------------+
| ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::b') |
+-------------------------------------------------------+
| x                                                     |
+-------------------------------------------------------+
1 row in set (0.02 sec)

mysql> SELECT ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::*');
+-------------------------------------------------------+
| ExtractValue('<a><b>x</b><c>y</c></a>','/a/child::*') |
+-------------------------------------------------------+
| x y                                                   |
+-------------------------------------------------------+
1 row in set (0.01 sec)
“Up-and-down” navigation is not supported in cases where the path would lead “above” the root element. That is, you cannot use expressions which match on descendants of ancestors of a given element, where one or more of the ancestors of the current element is also an ancestor of the root element (see Bug #16321).
The following XPath functions are not supported, or have known issues as indicated:
id()
lang()
local-name()
name()
namespace-uri()
normalize-space()
starts-with()
string()
substring-after()
substring-before()
translate()
The following axes are not supported:
following-sibling
following
preceding-sibling
preceding
XPath expressions passed as arguments to ExtractValue() and UpdateXML() may contain the colon character (:) in element selectors, which enables their use with markup employing XML namespaces notation. For example:
mysql> SET @xml = '<a>111<b:c>222<d>333</d><e:f>444</e:f></b:c></a>';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT ExtractValue(@xml, '//e:f');
+-----------------------------+
| ExtractValue(@xml, '//e:f') |
+-----------------------------+
| 444                         |
+-----------------------------+
1 row in set (0.00 sec)

mysql> SELECT UpdateXML(@xml, '//b:c', '<g:h>555</g:h>');
+--------------------------------------------+
| UpdateXML(@xml, '//b:c', '<g:h>555</g:h>') |
+--------------------------------------------+
| <a>111<g:h>555</g:h></a>                   |
+--------------------------------------------+
1 row in set (0.00 sec)
This is similar in some respects to what is permitted by Apache Xalan and some other parsers, and is much simpler than requiring namespace declarations or the use of the namespace-uri() and local-name() functions.
Error handling. For both ExtractValue() and UpdateXML(), the XPath locator used must be valid and the XML to be searched must consist of elements which are properly nested and closed. If the locator is invalid, an error is generated:
mysql> SELECT ExtractValue('<a>c</a><b/>', '/&a');
ERROR 1105 (HY000): XPATH syntax error: '&a'
If xml_frag does not consist of elements which are properly nested and closed, NULL is returned and a warning is generated, as shown in this example:
mysql> SELECT ExtractValue('<a>c</a><b', '//a');
+-----------------------------------+
| ExtractValue('<a>c</a><b', '//a') |
+-----------------------------------+
| NULL                              |
+-----------------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Warning
   Code: 1525
Message: Incorrect XML value: 'parse error at line 1 pos 11: END-OF-INPUT unexpected ('>' wanted)'
1 row in set (0.00 sec)

mysql> SELECT ExtractValue('<a>c</a><b/>', '//a');
+-------------------------------------+
| ExtractValue('<a>c</a><b/>', '//a') |
+-------------------------------------+
| c                                   |
+-------------------------------------+
1 row in set (0.00 sec)
The replacement XML used as the third argument to UpdateXML() is not checked to determine whether it consists solely of elements which are properly nested and closed.
XPath Injection. Code injection occurs when malicious code is introduced into the system to gain unauthorized access to privileges and data. It is based on exploiting assumptions made by developers about the type and content of data input from users. XPath is no exception in this regard.
A common scenario in which this can happen is the case of an application which handles authorization by matching the combination of a login name and password with those found in an XML file, using an XPath expression like this one:
//user[login/text()='neapolitan' and password/text()='1c3cr34m']/attribute::id
This is the XPath equivalent of an SQL statement like this one:
SELECT id FROM users WHERE login='neapolitan' AND password='1c3cr34m';
A PHP application employing XPath might handle the login process like this:
<?php

  $file     = "users.xml";

  $login    = $_POST["login"];
  $password = $_POST["password"];

  $xpath = "//user[login/text()=$login and password/text()=$password]/attribute::id";

  if( file_exists($file) )
  {
    $xml = simplexml_load_file($file);

    if($result = $xml->xpath($xpath))
      echo "You are now logged in as user $result[0].";
    else
      echo "Invalid login name or password.";
  }
  else
    exit("Failed to open $file.");

?>
No checks are performed on the input. This means that a malevolent user can “short-circuit” the test by entering ' or 1=1 for both the login name and password, resulting in $xpath being evaluated as shown here:
//user[login/text()='' or 1=1 and password/text()='' or 1=1]/attribute::id
Since the expression inside the square brackets always evaluates as true, it is effectively the same as this one, which matches the id attribute of every user element in the XML document:
//user/attribute::id
One way in which this particular attack can be circumvented is simply by quoting the variable names to be interpolated in the definition of $xpath, forcing the values passed from a Web form to be converted to strings:
$xpath = "//user[login/text()='$login' and password/text()='$password']/attribute::id";
This is the same strategy that is often recommended for preventing SQL injection attacks. In general, the practices you should follow for preventing XPath injection attacks are the same as for preventing SQL injection:
Never accept untested data from users in your application.
Check all user-submitted data for type; reject or convert data that is of the wrong type.
Test numeric data for out of range values; truncate, round, or reject values that are out of range. Test strings for illegal characters and either strip them out or reject input containing them.
Do not output explicit error messages that might provide an unauthorized user with clues that could be used to compromise the system; log these to a file or database table instead.
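One way to apply the quoting and character-testing advice above is to turn every user-supplied value into an XPath 1.0 string literal before it is placed in the expression. The Python sketch below is illustrative only: the helper names xpath_literal and login_xpath are hypothetical, and it assumes the target XPath processor implements the standard concat() function, which is needed when a value contains both kinds of quote.

```python
def xpath_literal(value):
    """Quote an arbitrary string as an XPath 1.0 string literal."""
    if "'" not in value:
        return "'" + value + "'"
    if '"' not in value:
        return '"' + value + '"'
    # Both quote styles occur: split on single quotes and rebuild with concat().
    pieces = ["'" + part + "'" for part in value.split("'")]
    return "concat(" + ", \"'\", ".join(pieces) + ")"

def login_xpath(login, password):
    """Build the login test with both values passed as string literals."""
    return ("//user[login/text()=" + xpath_literal(login)
            + " and password/text()=" + xpath_literal(password)
            + "]/attribute::id")

# The attack string from the text stays inside a literal and is only compared.
print(login_xpath("' or 1=1", "' or 1=1"))
```

XPath 1.0 has no escape character inside string literals, which is why the sketch switches quote style or falls back to concat() instead of escaping. Rejecting input that contains quote characters is a simpler alternative where such characters are never legitimate.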
Just as SQL injection attacks can be used to obtain information about database schemas, so can XPath injection be used to traverse XML files to uncover their structure, as discussed in Amit Klein's paper Blind XPath Injection (PDF file, 46KB).
It is also important to check the output being sent back to the client. Consider what can happen when we use the MySQL ExtractValue() function:
mysql> SELECT ExtractValue(
    -> LOAD_FILE('users.xml'),
    -> '//user[login/text()="" or 1=1 and password/text()="" or 1=1]/attribute::id'
    -> ) AS id;
+-------------------------------+
| id                            |
+-------------------------------+
| 00327 13579 02403 42354 28570 |
+-------------------------------+
1 row in set (0.01 sec)
Because ExtractValue() returns multiple matches as a single space-delimited string, this injection attack provides every valid ID contained within users.xml to the user as a single row of output. As an extra safeguard, you should also test output before returning it to the user. Here is a simple example:
mysql> SET @id = ExtractValue(
    -> LOAD_FILE('users.xml'),
    -> '//user[login/text()="" or 1=1 and password/text()="" or 1=1]/attribute::id'
    -> );
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT IF(
    -> INSTR(@id, ' ') = 0,
    -> @id,
    -> 'Unable to retrieve user ID')
    -> AS singleID;
+----------------------------+
| singleID                   |
+----------------------------+
| Unable to retrieve user ID |
+----------------------------+
1 row in set (0.00 sec)
In general, the guidelines for returning data to users securely are the same as for accepting user input. These can be summed up as:
Always test outgoing data for type and permissible values.
Never permit unauthorized users to view error messages that might provide information about the application that could be used to exploit it.
Bit functions and operators comprise BIT_COUNT(), BIT_AND(), BIT_OR(), BIT_XOR(), &, |, ^, ~, <<, and >>. (The BIT_AND(), BIT_OR(), and BIT_XOR() aggregate functions are described in Section 12.20.1, “Aggregate Function Descriptions”.) Prior to MySQL 8.0, bit functions and operators required BIGINT (64-bit integer) arguments and returned BIGINT values, so they had a maximum range of 64 bits. Non-BIGINT arguments were converted to BIGINT prior to performing the operation and truncation could occur.
In MySQL 8.0, bit functions and operators permit binary string type arguments (BINARY, VARBINARY, and the BLOB types) and return a value of like type, which enables them to take arguments and produce return values larger than 64 bits. Nonbinary string arguments are converted to BIGINT and processed as such, as before.
An implication of this change in behavior is that bit operations on binary string arguments might produce a different result in MySQL 8.0 than in 5.7. For information about how to prepare in MySQL 5.7 for potential incompatibilities between MySQL 5.7 and 8.0, see Bit Functions and Operators, in MySQL 5.7 Reference Manual.
Bit operations prior to MySQL 8.0 handle only unsigned 64-bit integer argument and result values (that is, unsigned BIGINT values). Conversion of arguments of other types to BIGINT occurs as necessary.
Examples:
This statement operates on numeric literals, treated as unsigned 64-bit integers:
mysql> SELECT 127 | 128, 128 << 2, BIT_COUNT(15);
+-----------+----------+---------------+
| 127 | 128 | 128 << 2 | BIT_COUNT(15) |
+-----------+----------+---------------+
|       255 |      512 |             4 |
+-----------+----------+---------------+
This statement performs to-number conversions on the string arguments ('127' to 127, and so forth) before performing the same operations as the first statement and producing the same results:
mysql> SELECT '127' | '128', '128' << 2, BIT_COUNT('15');
+---------------+------------+-----------------+
| '127' | '128' | '128' << 2 | BIT_COUNT('15') |
+---------------+------------+-----------------+
|           255 |        512 |               4 |
+---------------+------------+-----------------+
This statement uses hexadecimal literals for the bit-operation arguments. MySQL by default treats hexadecimal literals as binary strings, but in numeric context evaluates them as numbers (see Section 9.1.4, “Hexadecimal Literals”). Prior to MySQL 8.0, numeric context includes bit operations. Examples:
mysql> SELECT X'7F' | X'80', X'80' << 2, BIT_COUNT(X'0F');
+---------------+------------+------------------+
| X'7F' | X'80' | X'80' << 2 | BIT_COUNT(X'0F') |
+---------------+------------+------------------+
|           255 |        512 |                4 |
+---------------+------------+------------------+
Handling of bit-value literals in bit operations is similar to hexadecimal literals (that is, as numbers).
MySQL 8.0 extends bit operations to handle binary string arguments directly (without conversion) and produce binary string results. (Arguments that are not integers or binary strings are still converted to integers, as before.) This extension enhances bit operations in the following ways:
Bit operations become possible on values longer than 64 bits.
It is easier to perform bit operations on values that are more naturally represented as binary strings than as integers.
For example, consider UUID values and IPv6 addresses, which have human-readable text formats like this:
UUID: 6ccd780c-baba-1026-9564-5b8c656024db IPv6: fe80::219:d1ff:fe91:1a72
It is cumbersome to operate on text strings in those formats. An alternative is to convert them to fixed-length binary strings without delimiters. UUID_TO_BIN() and INET6_ATON() each produce a value of data type BINARY(16), a binary string 16 bytes (128 bits) long. The following statements illustrate this (HEX() is used to produce displayable values):
mysql> SELECT HEX(UUID_TO_BIN('6ccd780c-baba-1026-9564-5b8c656024db'));
+----------------------------------------------------------+
| HEX(UUID_TO_BIN('6ccd780c-baba-1026-9564-5b8c656024db')) |
+----------------------------------------------------------+
| 6CCD780CBABA102695645B8C656024DB                         |
+----------------------------------------------------------+

mysql> SELECT HEX(INET6_ATON('fe80::219:d1ff:fe91:1a72'));
+---------------------------------------------+
| HEX(INET6_ATON('fe80::219:d1ff:fe91:1a72')) |
+---------------------------------------------+
| FE800000000000000219D1FFFE911A72            |
+---------------------------------------------+
Those binary values are easily manipulable with bit operations to perform actions such as extracting the timestamp from UUID values, or extracting the network and host parts of IPv6 addresses. (For examples, see later in this discussion.)
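The idea of masking a 16-byte value can be tried outside the server as well. The following Python sketch is only an illustration and does not show MySQL syntax: the helper bytes_and is hypothetical, it requires operands of equal length, and the split into a 64-bit network part and a 64-bit host part is an assumption made for this example.

```python
import ipaddress
import uuid

def bytes_and(left, right):
    """Byte-wise AND of two equal-length byte strings."""
    if len(left) != len(right):
        raise ValueError("operands must have equal length")
    return bytes(a & b for a, b in zip(left, right))

# 16-byte forms of the two text values used in this discussion.
uuid_bin = uuid.UUID("6ccd780c-baba-1026-9564-5b8c656024db").bytes
addr_bin = ipaddress.IPv6Address("fe80::219:d1ff:fe91:1a72").packed

# Assumed /64 split: the first 8 bytes are the network, the last 8 the host.
net_mask = bytes([0xFF] * 8 + [0x00] * 8)
host_mask = bytes([0x00] * 8 + [0xFF] * 8)

network = bytes_and(addr_bin, net_mask)
host = bytes_and(addr_bin, host_mask)

print(uuid_bin.hex().upper())
print(addr_bin.hex().upper())
print(network.hex().upper())
print(host.hex().upper())
```

Because both operands are 128 bits long, none of these operations would fit in a 64-bit integer, which is the case the binary-string form is intended for.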
Arguments that count as binary strings include column values, routine parameters, local variables, and user-defined variables that have a binary string type: BINARY, VARBINARY, or one of the BLOB types.
What about hexadecimal literals and bit literals? Recall that those are binary strings by default in MySQL, but numbers in numeric context. How are they handled for bit operations in MySQL 8.0? Does MySQL continue to evaluate them in numeric context, as is done prior to MySQL 8.0? Or do bit operations evaluate them as binary strings, now that binary strings can be handled “natively” without conversion?
Answer: It has been common to specify arguments to bit operations using hexadecimal literals or bit literals with the intent that they represent numbers, so MySQL continues to evaluate bit operations in numeric context when all bit arguments are hexadecimal or bit literals, for backward compatibility. If you require evaluation as binary strings instead, that is easily accomplished: Use the _binary introducer for at least one literal.
These bit operations evaluate the hexadecimal literals and bit literals as integers:
mysql> SELECT X'40' | X'01', b'11110001' & b'01001111';
+---------------+---------------------------+
| X'40' | X'01' | b'11110001' & b'01001111' |
+---------------+---------------------------+
|            65 |                        65 |
+---------------+---------------------------+
These bit operations evaluate the hexadecimal literals and bit literals as binary strings, due to the _binary introducer:
mysql> SELECT _binary X'40' | X'01', b'11110001' & _binary b'01001111';
+-----------------------+-----------------------------------+
| _binary X'40' | X'01' | b'11110001' & _binary b'01001111' |
+-----------------------+-----------------------------------+
| A                     | A                                 |
+-----------------------+-----------------------------------+
Although the bit operations in both statements produce a result with a numeric value of 65, the second statement operates in binary-string context, for which 65 is ASCII A.
In numeric evaluation context, permitted values of hexadecimal literal and bit literal arguments have a maximum of 64 bits, as do results. By contrast, in binary-string evaluation context, permitted arguments (and results) can exceed 64 bits:
mysql> SELECT _binary X'4040404040404040' | X'0102030405060708';
+---------------------------------------------------+
| _binary X'4040404040404040' | X'0102030405060708' |
+---------------------------------------------------+
| ABCDEFGH                                          |
+---------------------------------------------------+
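The difference between the two evaluation contexts can be modeled outside MySQL. The sketch below is an illustration in Python under stated assumptions: `or_numeric` and `or_binary` are hypothetical names, numeric evaluation is approximated by masking to 64 bits, and the binary-string version assumes operands of equal length.

```python
UINT64_MASK = (1 << 64) - 1


def or_numeric(a: int, b: int) -> int:
    """Model numeric evaluation: operands and result are unsigned 64-bit integers."""
    return (a & UINT64_MASK) | (b & UINT64_MASK)


def or_binary(a: bytes, b: bytes) -> bytes:
    """Model binary-string evaluation: OR byte by byte (equal lengths assumed)."""
    return bytes(x | y for x, y in zip(a, b))


# Numeric context: X'40' | X'01' is the integer 65.
print(or_numeric(0x40, 0x01))

# Binary-string context: the same bits come back as the one-byte string 'A'.
print(or_binary(bytes.fromhex("40"), bytes.fromhex("01")))

# Binary-string operands are not limited to 64 bits.
wide = or_binary(bytes.fromhex("4040404040404040" * 2),
                 bytes.fromhex("0102030405060708" * 2))
print(len(wide) * 8, wide)
```

The last call operates on 128-bit operands, which has no counterpart in numeric context.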
There are several ways to refer to a hexadecimal literal or bit literal in a bit operation to cause binary-string evaluation:
_binary literal
BINARY literal
CAST(literal AS BINARY)
Another way to produce binary-string evaluation of hexadecimal literals or bit literals is to assign them to user-defined variables, which results in variables that have a binary string type:
mysql> SET @v1 = X'40', @v2 = X'01', @v3 = b'11110001', @v4 = b'01001111';
mysql> SELECT @v1 | @v2, @v3 & @v4;
+-----------+-----------+
| @v1 | @v2 | @v3 & @v4 |
+-----------+-----------+
| A         | A         |
+-----------+-----------+
In binary-string context, bitwise operation arguments must have the same length or an ER_INVALID_BITWISE_OPERANDS_SIZE error occurs:
mysql> SELECT _binary X'40' | X'0001';
ERROR 3513 (HY000): Binary operands of bitwise operators must be of equal length
To satisfy the equal-length requirement, pad the shorter value with leading zero digits or, if the longer value begins with leading zero digits and a shorter result value is acceptable, strip them:
mysql> SELECT _binary X'0040' | X'0001';
+---------------------------+
| _binary X'0040' | X'0001' |
+---------------------------+
| A                         |
+---------------------------+
mysql> SELECT _binary X'40' | X'01';
+-----------------------+
| _binary X'40' | X'01' |
+-----------------------+
| A                     |
+-----------------------+
Padding or stripping can also be accomplished using functions such as LPAD(), RPAD(), SUBSTR(), or CAST(). In such cases, the expression arguments are no longer all literals and _binary becomes unnecessary. Examples:
mysql> SELECT LPAD(X'40', 2, X'00') | X'0001';
+---------------------------------+
| LPAD(X'40', 2, X'00') | X'0001' |
+---------------------------------+
| A                               |
+---------------------------------+
mysql> SELECT X'40' | SUBSTR(X'0001', 2, 1);
+-------------------------------+
| X'40' | SUBSTR(X'0001', 2, 1) |
+-------------------------------+
| A                             |
+-------------------------------+
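The equal-length rule and the two ways of satisfying it (padding the shorter operand or stripping the longer one) can be sketched as follows. This is a Python model for illustration only: `or_binary_strict` and `lpad` are hypothetical helpers, and the `ValueError` merely stands in for the ER_INVALID_BITWISE_OPERANDS_SIZE error.

```python
def or_binary_strict(a: bytes, b: bytes) -> bytes:
    """OR two binary strings, rejecting operands of unequal length."""
    if len(a) != len(b):
        raise ValueError("Binary operands of bitwise operators must be of equal length")
    return bytes(x | y for x, y in zip(a, b))


def lpad(value: bytes, length: int, pad: bytes = b"\x00") -> bytes:
    """Left-pad value with pad bytes up to length; longer values are cut to length."""
    if len(value) >= length:
        return value[:length]
    fill = (pad * length)[: length - len(value)]
    return fill + value


short, long_ = bytes.fromhex("40"), bytes.fromhex("0001")

try:
    or_binary_strict(short, long_)
except ValueError as exc:
    print("error:", exc)

# Pad the shorter operand to two bytes: the result is two bytes long.
print(or_binary_strict(lpad(short, 2), long_).hex().upper())

# Or strip the leading zero byte from the longer operand: one-byte result.
print(or_binary_strict(short, long_[1:2]).hex().upper())
```

Padding preserves the longer width, whereas stripping yields a shorter result, which is acceptable only when the dropped leading bytes are zero.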
The following example illustrates use of bit operations to extract parts of a UUID value, in this case, the timestamp and IEEE 802 node number. This technique requires bitmasks for each extracted part.
Convert the text UUID to the corresponding 16-byte binary value so that it can be manipulated using bit operations in binary-string context:
mysql> SET @uuid = UUID_TO_BIN('6ccd780c-baba-1026-9564-5b8c656024db');
mysql> SELECT HEX(@uuid);
+----------------------------------+
| HEX(@uuid)                       |
+----------------------------------+
| 6CCD780CBABA102695645B8C656024DB |
+----------------------------------+
Construct bitmasks for the timestamp and node number parts of the value. The timestamp comprises the first three parts (64 bits, bits 0 to 63) and the node number is the last part (48 bits, bits 80 to 127):
mysql> SET @ts_mask = CAST(X'FFFFFFFFFFFFFFFF' AS BINARY(16));
mysql> SET @node_mask = CAST(X'FFFFFFFFFFFF' AS BINARY(16)) >> 80;
mysql> SELECT HEX(@ts_mask);
+----------------------------------+
| HEX(@ts_mask)                    |
+----------------------------------+
| FFFFFFFFFFFFFFFF0000000000000000 |
+----------------------------------+
mysql> SELECT HEX(@node_mask);
+----------------------------------+
| HEX(@node_mask)                  |
+----------------------------------+
| 00000000000000000000FFFFFFFFFFFF |
+----------------------------------+
The CAST(... AS BINARY(16)) function is used here because the masks must be the same length as the UUID value against which they are applied. The same result can be produced using other functions to pad the masks to the required length:
SET @ts_mask = RPAD(X'FFFFFFFFFFFFFFFF', 16, X'00');
SET @node_mask = LPAD(X'FFFFFFFFFFFF', 16, X'00');
Use the masks to extract the timestamp and node number parts:
mysql> SELECT HEX(@uuid & @ts_mask) AS 'timestamp part';
+----------------------------------+
| timestamp part                   |
+----------------------------------+
| 6CCD780CBABA10260000000000000000 |
+----------------------------------+
mysql> SELECT HEX(@uuid & @node_mask) AS 'node part';
+----------------------------------+
| node part                        |
+----------------------------------+
| 000000000000000000005B8C656024DB |
+----------------------------------+
The preceding example uses these bit operations: right shift (>>) and bitwise AND (&).
UUID_TO_BIN() takes a flag that causes some bit rearrangement in the resulting binary UUID value. If you use that flag, modify the extraction masks accordingly.
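The mask arithmetic in the UUID example can be reproduced with ordinary integers. The following Python sketch is an illustration outside MySQL: it treats the 16-byte value as one 128-bit big-endian integer, assumes the default byte order (no UUID_TO_BIN() swap flag), and uses variable names chosen to echo the user-defined variables above.

```python
import uuid

FULL_128 = (1 << 128) - 1

raw = uuid.UUID("6ccd780c-baba-1026-9564-5b8c656024db").bytes
value = int.from_bytes(raw, "big")

# Timestamp: the first 64 bits. Padding the mask on the right corresponds to
# shifting sixteen F digits into the high half of the 128-bit value.
ts_mask = (0xFFFFFFFFFFFFFFFF << 64) & FULL_128

# Node number: the last 48 bits, so the mask needs no shift in integer form.
node_mask = 0xFFFFFFFFFFFF

ts_part = (value & ts_mask).to_bytes(16, "big")
node_part = (value & node_mask).to_bytes(16, "big")

print(ts_part.hex().upper())
print(node_part.hex().upper())
```

Converting back with `to_bytes(16, "big")` keeps the result at the full 16-byte width, just as the binary-string operations do.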
The next example uses bit operations to extract the network and host parts of an IPv6 address. Suppose that the network part has a length of 80 bits. Then the host part has a length of 128 − 80 = 48 bits. To extract the network and host parts of the address, convert it to a binary string, then use bit operations in binary-string context.
Convert the text IPv6 address to the corresponding binary string:
mysql> SET @ip = INET6_ATON('fe80::219:d1ff:fe91:1a72');
Define the network length in bits:
mysql> SET @net_len = 80;
Construct network and host masks by shifting the all-ones address left or right. To do this, begin with the address ::, which is shorthand for all zeros, as you can see by converting it to a binary string like this:
mysql> SELECT HEX(INET6_ATON('::')) AS 'all zeros';
+----------------------------------+
| all zeros                        |
+----------------------------------+
| 00000000000000000000000000000000 |
+----------------------------------+
To produce the complementary value (all ones), use the ~ operator to invert the bits:
mysql> SELECT HEX(~INET6_ATON('::')) AS 'all ones';
+----------------------------------+
| all ones                         |
+----------------------------------+
| FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF |
+----------------------------------+
Shift the all-ones value left or right to produce the network and host masks:
mysql> SET @net_mask = ~INET6_ATON('::') << (128 - @net_len);
mysql> SET @host_mask = ~INET6_ATON('::') >> @net_len;
Display the masks to verify that they cover the correct parts of the address:
mysql> SELECT INET6_NTOA(@net_mask) AS 'network mask';
+----------------------------+
| network mask               |
+----------------------------+
| ffff:ffff:ffff:ffff:ffff:: |
+----------------------------+
mysql> SELECT INET6_NTOA(@host_mask) AS 'host mask';
+------------------------+
| host mask              |
+------------------------+
| ::ffff:255.255.255.255 |
+------------------------+
Extract and display the network and host parts of the address:
mysql> SET @net_part = @ip & @net_mask;
mysql> SET @host_part = @ip & @host_mask;
mysql> SELECT INET6_NTOA(@net_part) AS 'network part';
+-----------------+
| network part    |
+-----------------+
| fe80::219:0:0:0 |
+-----------------+
mysql> SELECT INET6_NTOA(@host_part) AS 'host part';
+------------------+
| host part        |
+------------------+
| ::d1ff:fe91:1a72 |
+------------------+
The preceding example uses these bit operations: complement (~), left shift (<<), right shift (>>), and bitwise AND (&).
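The same network and host split can be checked with integer arithmetic. This is a hedged Python sketch rather than anything MySQL provides: it relies on the standard `ipaddress` module, and the 128-bit masking after the left shift is needed because Python integers, unlike a fixed-length binary string, do not drop bits shifted off the end.

```python
import ipaddress

ALL_ONES = (1 << 128) - 1   # counterpart of ~INET6_ATON('::')

ip = int(ipaddress.IPv6Address("fe80::219:d1ff:fe91:1a72"))
net_len = 80

# Shift the all-ones value left or right, keeping the result 128 bits wide.
net_mask = (ALL_ONES << (128 - net_len)) & ALL_ONES
host_mask = ALL_ONES >> net_len

net_part = ipaddress.IPv6Address(ip & net_mask)
host_part = ipaddress.IPv6Address(ip & host_mask)

print(format(net_mask, "032x"))
print(format(host_mask, "032x"))
print(net_part)
print(host_part)
```

The two masks are complementary, so combining the network part and host part with bitwise OR gives back the original address.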
The remaining discussion provides details on argument handling for each group of bit operations, more information about literal-value handling in bit operations, and potential incompatibilities between MySQL 8.0 and older MySQL versions.
For &, |, and ^ bit operations, the result type depends on whether the arguments are evaluated as binary strings or numbers:
Binary-string evaluation occurs when the arguments have a binary string type, and at least one of them is not a hexadecimal literal, bit literal, or NULL literal. Numeric evaluation occurs otherwise, with argument conversion to unsigned 64-bit integers as necessary.
Binary-string evaluation produces a binary string of the same length as the arguments. If the arguments have unequal lengths, an ER_INVALID_BITWISE_OPERANDS_SIZE error occurs. Numeric evaluation produces an unsigned 64-bit integer.
Examples of numeric evaluation:
mysql> SELECT 64 | 1, X'40' | X'01';
+--------+---------------+
| 64 | 1 | X'40' | X'01' |
+--------+---------------+
|     65 |            65 |
+--------+---------------+
Examples of binary-string evaluation:
mysql> SELECT _binary X'40' | X'01';
+-----------------------+
| _binary X'40' | X'01' |
+-----------------------+
| A                     |
+-----------------------+
mysql> SET @var1 = X'40', @var2 = X'01';
mysql> SELECT @var1 | @var2;
+---------------+
| @var1 | @var2 |
+---------------+
| A             |
+---------------+
For ~, <<, and >> bit operations, the result type depends on whether the bit argument is evaluated as a binary string or number:
Binary-string evaluation occurs when the bit argument has a binary string type, and is not a hexadecimal literal, bit literal, or NULL literal. Numeric evaluation occurs otherwise, with argument conversion to an unsigned 64-bit integer as necessary.
Binary-string evaluation produces a binary string of the same length as the bit argument. Numeric evaluation produces an unsigned 64-bit integer.
For shift operations, bits shifted off the end of the value are lost without warning, regardless of the argument type. In particular, if the shift count is greater than or equal to the number of bits in the bit argument, all bits in the result are 0.
Examples of numeric evaluation:
mysql> SELECT ~0, 64 << 2, X'40' << 2;
+----------------------+---------+------------+
| ~0                   | 64 << 2 | X'40' << 2 |
+----------------------+---------+------------+
| 18446744073709551615 |     256 |        256 |
+----------------------+---------+------------+
Examples of binary-string evaluation:
mysql> SELECT HEX(_binary X'1111000022220000' >> 16);
+----------------------------------------+
| HEX(_binary X'1111000022220000' >> 16) |
+----------------------------------------+
| 0000111100002222                       |
+----------------------------------------+
mysql> SELECT HEX(_binary X'1111000022220000' << 16);
+----------------------------------------+
| HEX(_binary X'1111000022220000' << 16) |
+----------------------------------------+
| 0000222200000000                       |
+----------------------------------------+
mysql> SET @var1 = X'F0F0F0F0';
mysql> SELECT HEX(~@var1);
+-------------+
| HEX(~@var1) |
+-------------+
| 0F0F0F0F    |
+-------------+
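The fixed-width behavior of shifts and complement on binary strings (result length equals argument length, bits shifted off the end are lost) can be modeled as below. This is a Python sketch for illustration, with hypothetical helper names; it is not how the server implements the operators.

```python
def shift_left(value: bytes, count: int) -> bytes:
    """Shift a binary string left, dropping bits that leave the fixed width."""
    width = len(value) * 8
    if count >= width:
        return bytes(len(value))          # all bits shifted out: all zeros
    shifted = (int.from_bytes(value, "big") << count) & ((1 << width) - 1)
    return shifted.to_bytes(len(value), "big")


def shift_right(value: bytes, count: int) -> bytes:
    """Shift a binary string right; vacated high bits are filled with zeros."""
    return (int.from_bytes(value, "big") >> count).to_bytes(len(value), "big")


def invert(value: bytes) -> bytes:
    """Complement every bit, keeping the same length."""
    return bytes(b ^ 0xFF for b in value)


sample = bytes.fromhex("1111000022220000")
print(shift_right(sample, 16).hex().upper())
print(shift_left(sample, 16).hex().upper())
print(shift_left(sample, 64).hex().upper())
print(invert(bytes.fromhex("F0F0F0F0")).hex().upper())
```

The third call uses a shift count equal to the number of bits in the argument, so every bit of the result is 0.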
The BIT_COUNT() function always returns an unsigned 64-bit integer, or NULL if the argument is NULL.
mysql> SELECT BIT_COUNT(127);
+----------------+
| BIT_COUNT(127) |
+----------------+
|              7 |
+----------------+
mysql> SELECT BIT_COUNT(b'010101'), BIT_COUNT(_binary b'010101');
+----------------------+------------------------------+
| BIT_COUNT(b'010101') | BIT_COUNT(_binary b'010101') |
+----------------------+------------------------------+
|                    3 |                            3 |
+----------------------+------------------------------+
For the BIT_AND(), BIT_OR(), and BIT_XOR() bit functions, the result type depends on whether the function argument values are evaluated as binary strings or numbers:
Binary-string evaluation occurs when the argument values have a binary string type, and the argument is not a hexadecimal literal, bit literal, or NULL literal. Numeric evaluation occurs otherwise, with argument value conversion to unsigned 64-bit integers as necessary.
Binary-string evaluation produces a binary string of the same length as the argument values. If argument values have unequal lengths, an ER_INVALID_BITWISE_OPERANDS_SIZE error occurs. If the argument size exceeds 511 bytes, an ER_INVALID_BITWISE_AGGREGATE_OPERANDS_SIZE error occurs. Numeric evaluation produces an unsigned 64-bit integer.
NULL values do not affect the result unless all values are NULL. In that case, the result is a neutral value having the same length as the length of the argument values (all bits 1 for BIT_AND(), all bits 0 for BIT_OR() and BIT_XOR()).
Example:
mysql> CREATE TABLE t (group_id INT, a VARBINARY(6));
mysql> INSERT INTO t VALUES (1, NULL);
mysql> INSERT INTO t VALUES (1, NULL);
mysql> INSERT INTO t VALUES (2, NULL);
mysql> INSERT INTO t VALUES (2, X'1234');
mysql> INSERT INTO t VALUES (2, X'FF34');
mysql> SELECT HEX(BIT_AND(a)), HEX(BIT_OR(a)), HEX(BIT_XOR(a))
       FROM t GROUP BY group_id;
+-----------------+----------------+-----------------+
| HEX(BIT_AND(a)) | HEX(BIT_OR(a)) | HEX(BIT_XOR(a)) |
+-----------------+----------------+-----------------+
| FFFFFFFFFFFF    | 000000000000   | 000000000000    |
| 1234            | FF34           | ED00            |
+-----------------+----------------+-----------------+
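The NULL handling of the aggregate bit functions can be modeled with a small reduction. The sketch below is Python written for illustration, not server code: `bit_aggregate` is a hypothetical helper, `None` stands in for NULL, and the `declared_width` parameter is an assumption used only to size the neutral value for an all-NULL group (6 bytes here, matching the VARBINARY(6) column in the example).

```python
from functools import reduce
from operator import and_, or_, xor


def bit_aggregate(values, op, neutral_byte, declared_width):
    """Combine non-NULL binary strings with op; all-NULL input gives the neutral value."""
    present = [v for v in values if v is not None]
    if not present:
        return bytes([neutral_byte]) * declared_width
    if len({len(v) for v in present}) > 1:
        raise ValueError("argument values must have equal lengths")
    return reduce(lambda a, b: bytes(op(x, y) for x, y in zip(a, b)), present)


groups = {
    1: [None, None],
    2: [None, bytes.fromhex("1234"), bytes.fromhex("FF34")],
}

for group_id, values in groups.items():
    row = (
        bit_aggregate(values, and_, 0xFF, 6),   # BIT_AND: neutral value is all 1 bits
        bit_aggregate(values, or_, 0x00, 6),    # BIT_OR: neutral value is all 0 bits
        bit_aggregate(values, xor, 0x00, 6),    # BIT_XOR: neutral value is all 0 bits
    )
    print(group_id, [part.hex().upper() for part in row])
```

Group 1 contains only NULL values and therefore yields the neutral values, while the single NULL in group 2 is simply skipped.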
For backward compatibility, MySQL 8.0 evaluates bit operations in numeric context when all bit arguments are hexadecimal literals, bit literals, or NULL literals. That is, bit operations on binary-string bit arguments do not use binary-string evaluation if all bit arguments are unadorned hexadecimal literals, bit literals, or NULL literals. (This does not apply to such literals if they are written with a _binary introducer, BINARY operator, or other way of specifying them explicitly as binary strings.)
The literal handling just described is the same as prior to MySQL 8.0. Examples:
These bit operations evaluate the literals in numeric context and produce a BIGINT result:
b'0001' | b'0010'
X'0008' << 8
These bit operations evaluate NULL in numeric context and produce a BIGINT result that has a NULL value:
NULL & NULL
NULL >> 4
In MySQL 8.0, you can cause those operations to evaluate the arguments in binary-string context by indicating explicitly that at least one argument is a binary string:
_binary b'0001' | b'0010'
_binary X'0008' << 8
BINARY NULL & NULL
BINARY NULL >> 4
The result of the last two expressions is NULL, just as without the BINARY operator, but the data type of the result is a binary string type rather than an integer type.
Because bit operations can handle binary string arguments natively in MySQL 8.0, some expressions produce a different result in MySQL 8.0 than in 5.7. The five problematic expression types to watch out for are:
nonliteral_binary { & | ^ } binary
binary { & | ^ } nonliteral_binary
nonliteral_binary { << >> } anything
~ nonliteral_binary
AGGR_BIT_FUNC(nonliteral_binary)
Those expressions return BIGINT in MySQL 5.7, binary string in 8.0.
Explanation of notation:
{ op1 op2 ... }: List of operators that apply to the given expression type.
binary: Any kind of binary string argument, including a hexadecimal literal, bit literal, or NULL literal.
nonliteral_binary: An argument that is a binary string value other than a hexadecimal literal, bit literal, or NULL literal.
AGGR_BIT_FUNC: An aggregate function that takes bit-value arguments: BIT_AND(), BIT_OR(), BIT_XOR().
For information about how to prepare in MySQL 5.7 for potential incompatibilities between MySQL 5.7 and 8.0, see Bit Functions and Operators, in MySQL 5.7 Reference Manual.
The following list describes available bit functions and operators:
|
Bitwise OR.
The result type depends on whether the arguments are evaluated as binary strings or numbers:
Binary-string evaluation occurs when the arguments have a binary string type, and at least one of them is not a hexadecimal literal, bit literal, or NULL literal. Numeric evaluation occurs otherwise, with argument conversion to unsigned 64-bit integers as necessary.
Binary-string evaluation produces a binary string of the same length as the arguments. If the arguments have unequal lengths, an ER_INVALID_BITWISE_OPERANDS_SIZE error occurs. Numeric evaluation produces an unsigned 64-bit integer.
For more information, see the introductory discussion in this section.
mysql> SELECT 29 | 15;
        -> 31
mysql> SELECT _binary X'40404040' | X'01020304';
        -> 'ABCD'
&
Bitwise AND.
The result type depends on whether the arguments are evaluated as binary strings or numbers:
Binary-string evaluation occurs when the arguments have a binary string type, and at least one of them is not a hexadecimal literal, bit literal, or NULL literal. Numeric evaluation occurs otherwise, with argument conversion to unsigned 64-bit integers as necessary.
Binary-string evaluation produces a binary string of the same length as the arguments. If the arguments have unequal lengths, an ER_INVALID_BITWISE_OPERANDS_SIZE error occurs. Numeric evaluation produces an unsigned 64-bit integer.
For more information, see the introductory discussion in this section.
mysql> SELECT 29 & 15;
        -> 13
mysql> SELECT HEX(_binary X'FF' & b'11110000');
        -> 'F0'
^
Bitwise XOR.
The result type depends on whether the arguments are evaluated as binary strings or numbers:
Binary-string evaluation occurs when the arguments have a binary string type, and at least one of them is not a hexadecimal literal, bit literal, or NULL literal. Numeric evaluation occurs otherwise, with argument conversion to unsigned 64-bit integers as necessary.
Binary-string evaluation produces a binary string of the same length as the arguments. If the arguments have unequal lengths, an ER_INVALID_BITWISE_OPERANDS_SIZE error occurs. Numeric evaluation produces an unsigned 64-bit integer.
For more information, see the introductory discussion in this section.
mysql> SELECT 1 ^ 1;
        -> 0
mysql> SELECT 1 ^ 0;
        -> 1
mysql> SELECT 11 ^ 3;
        -> 8
mysql> SELECT HEX(_binary X'FEDC' ^ X'1111');
        -> 'EFCD'
<<
Shifts a longlong (BIGINT) number or binary string to the left.
The result type depends on whether the bit argument is evaluated as a binary string or number:
Binary-string evaluation occurs when the bit argument has a binary string type, and is not a hexadecimal literal, bit literal, or NULL literal. Numeric evaluation occurs otherwise, with argument conversion to an unsigned 64-bit integer as necessary.
Binary-string evaluation produces a binary string of the same length as the bit argument. Numeric evaluation produces an unsigned 64-bit integer.
Bits shifted off the end of the value are lost without warning, regardless of the argument type. In particular, if the shift count is greater than or equal to the number of bits in the bit argument, all bits in the result are 0.
For more information, see the introductory discussion in this section.
mysql> SELECT 1 << 2;
        -> 4
mysql> SELECT HEX(_binary X'00FF00FF00FF' << 8);
        -> 'FF00FF00FF00'
>>
Shifts a longlong (BIGINT) number or binary string to the right.
The result type depends on whether the bit argument is evaluated as a binary string or number:
Binary-string evaluation occurs when the bit argument has a binary string type, and is not a hexadecimal literal, bit literal, or NULL literal. Numeric evaluation occurs otherwise, with argument conversion to an unsigned 64-bit integer as necessary.
Binary-string evaluation produces a binary string of the same length as the bit argument. Numeric evaluation produces an unsigned 64-bit integer.
Bits shifted off the end of the value are lost without warning, regardless of the argument type. In particular, if the shift count is greater than or equal to the number of bits in the bit argument, all bits in the result are 0.
For more information, see the introductory discussion in this section.
mysql> SELECT 4 >> 2;
        -> 1
mysql> SELECT HEX(_binary X'00FF00FF00FF' >> 8);
        -> '0000FF00FF00'
~
Invert all bits.
The result type depends on whether the bit argument is evaluated as a binary string or number:
Binary-string evaluation occurs when the bit argument has a binary string type, and is not a hexadecimal literal, bit literal, or NULL literal. Numeric evaluation occurs otherwise, with argument conversion to an unsigned 64-bit integer as necessary.
Binary-string evaluation produces a binary string of the same length as the bit argument. Numeric evaluation produces an unsigned 64-bit integer.
For more information, see the introductory discussion in this section.
mysql> SELECT 5 & ~1;
        -> 4
mysql> SELECT HEX(~X'0000FFFF1111EEEE');
        -> 'FFFF0000EEEE1111'
BIT_COUNT(N)
Returns the number of bits that are set in the argument N as an unsigned 64-bit integer, or NULL if the argument is NULL.
mysql> SELECT BIT_COUNT(64), BIT_COUNT(BINARY 64);
        -> 1, 7
mysql> SELECT BIT_COUNT('64'), BIT_COUNT(_binary '64');
        -> 1, 7
mysql> SELECT BIT_COUNT(X'40'), BIT_COUNT(_binary X'40');
        -> 1, 1
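The BIT_COUNT() results above differ between the numeric and binary-string forms of 64 because different bits are being counted. The following Python sketch models the two cases under stated assumptions: the helper names are hypothetical, the numeric case masks to 64 bits, and the binary-string case counts bits over the bytes of the string (so BINARY 64 is the two-character string '64').

```python
def bit_count_numeric(n: int) -> int:
    """Count set bits of n viewed as an unsigned 64-bit integer."""
    return bin(n & ((1 << 64) - 1)).count("1")


def bit_count_binary(value: bytes) -> int:
    """Count set bits across every byte of a binary string."""
    return sum(bin(byte).count("1") for byte in value)


# The number 64 is binary 1000000: one bit set.
print(bit_count_numeric(64))

# The binary string '64' is the bytes 0x36 0x34: four plus three bits set.
print(bit_count_binary(b"64"))

# X'40' is a single byte with one bit set, whichever way it is evaluated.
print(bit_count_numeric(0x40), bit_count_binary(bytes.fromhex("40")))
```

This also shows why the nonbinary string '64' counts as 1: it is converted to the number 64 before the bits are counted.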
Table 12.18 Encryption Functions
Name | Description |
---|---|
AES_DECRYPT() | Decrypt using AES |
AES_ENCRYPT() | Encrypt using AES |
COMPRESS() | Return result as a binary string |
MD5() | Calculate MD5 checksum |
RANDOM_BYTES() | Return a random byte vector |
SHA1(), SHA() | Calculate an SHA-1 160-bit checksum |
SHA2() | Calculate an SHA-2 checksum |
STATEMENT_DIGEST() | Compute statement digest hash value |
STATEMENT_DIGEST_TEXT() | Compute normalized statement digest |
UNCOMPRESS() | Uncompress a string compressed |
UNCOMPRESSED_LENGTH() | Return the length of a string before compression |
VALIDATE_PASSWORD_STRENGTH() | Determine strength of password |
Many encryption and compression functions return strings for which the result might contain arbitrary byte values. If you want to store these results, use a column with a VARBINARY or BLOB binary string data type. This avoids potential problems with trailing space removal or character set conversion that would change data values, such as may occur if you use a nonbinary string data type (CHAR, VARCHAR, TEXT).
Some encryption functions return strings of ASCII characters: MD5(), SHA(), SHA1(), SHA2(), STATEMENT_DIGEST(), STATEMENT_DIGEST_TEXT(). Their return value is a string that has a character set and collation determined by the character_set_connection and collation_connection system variables. This is a nonbinary string unless the character set is binary.
If an application stores values from a function such as MD5() or SHA1() that returns a string of hex digits, more efficient storage and comparisons can be obtained by converting the hex representation to binary using UNHEX() and storing the result in a BINARY(N) column. Each pair of hexadecimal digits requires one byte in binary form, so the value of N depends on the length of the hex string. N is 16 for an MD5() value and 20 for a SHA1() value. For SHA2(), N ranges from 28 to 64 depending on the argument specifying the desired bit length of the result.
The size penalty for storing the hex string in a CHAR column is at least two times, up to eight times if the value is stored in a column that uses the utf8mb4 character set (where each character uses 4 bytes). Storing the string also results in slower comparisons because of the larger values and the need to take character set collation rules into account.
Suppose that an application stores MD5() string values in a CHAR(32) column:
CREATE TABLE md5_tbl (md5_val CHAR(32), ...);
INSERT INTO md5_tbl (md5_val, ...) VALUES(MD5('abcdef'), ...);
To convert hex strings to more compact form, modify the application to use UNHEX() and BINARY(16) instead as follows:
CREATE TABLE md5_tbl (md5_val BINARY(16), ...);
INSERT INTO md5_tbl (md5_val, ...) VALUES(UNHEX(MD5('abcdef')), ...);
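The storage arithmetic behind this advice (two hex digits per byte) can be checked with any hashing library. The sketch below uses Python's standard `hashlib` purely as an illustration; `hex_and_binary_sizes` is a hypothetical helper, and `bytes.fromhex()` plays the role of UNHEX().

```python
import hashlib


def hex_and_binary_sizes(hasher) -> tuple:
    """Return (length of the hex string, length of the raw digest) in bytes."""
    return len(hasher.hexdigest()), hasher.digest_size


data = b"abcdef"
for name, hasher in [
    ("MD5", hashlib.md5(data)),
    ("SHA1", hashlib.sha1(data)),
    ("SHA2/224", hashlib.sha224(data)),
    ("SHA2/512", hashlib.sha512(data)),
]:
    hex_len, bin_len = hex_and_binary_sizes(hasher)
    print(f"{name}: CHAR({hex_len}) as hex, BINARY({bin_len}) as raw bytes")

# Converting the hex form back to bytes gives exactly the raw digest.
md5 = hashlib.md5(data)
unhexed = bytes.fromhex(md5.hexdigest())
print(unhexed == md5.digest())
```

The hex form is always twice the size of the raw digest, before any multi-byte character set overhead is taken into account.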
Applications should be prepared to handle the very rare case that a hashing function produces the same value for two different input values. One way to make collisions detectable is to make the hash column a primary key.
Exploits for the MD5 and SHA-1 algorithms have become known. You may wish to consider using another one-way encryption function described in this section instead, such as SHA2().
Passwords or other sensitive values supplied as arguments to encryption functions are sent as cleartext to the MySQL server unless an SSL connection is used. Also, such values appear in any MySQL logs to which they are written. To avoid these types of exposure, applications can encrypt sensitive values on the client side before sending them to the server. The same considerations apply to encryption keys. To avoid exposing these, applications can use stored procedures to encrypt and decrypt values on the server side.
AES_DECRYPT(crypt_str,key_str[,init_vector])
This function decrypts data using the official AES (Advanced Encryption Standard) algorithm. For more information, see the description of AES_ENCRYPT().
Statements that use AES_DECRYPT() are unsafe for statement-based replication.
AES_ENCRYPT(str,key_str[,init_vector])
AES_ENCRYPT() and AES_DECRYPT() implement encryption and decryption of data using the official AES (Advanced Encryption Standard) algorithm, previously known as “Rijndael.” The AES standard permits various key lengths. By default these functions implement AES with a 128-bit key length. Key lengths of 192 or 256 bits can be used, as described later. The key length is a trade-off between performance and security.
AES_ENCRYPT() encrypts the string str using the key string key_str and returns a binary string containing the encrypted output. AES_DECRYPT() decrypts the encrypted string crypt_str using the key string key_str and returns the original plaintext string. If either function argument is NULL, the function returns NULL.
The str and crypt_str arguments can be any length, and padding is automatically added to str so it is a multiple of a block as required by block-based algorithms such as AES. This padding is automatically removed by the AES_DECRYPT() function. The length of crypt_str can be calculated using this formula:
16 * (trunc(string_length / 16) + 1)
For a key length of 128 bits, the most secure way to pass a key to the key_str argument is to create a truly random 128-bit value and pass it as a binary value. For example:
INSERT INTO t VALUES (1,AES_ENCRYPT('text',UNHEX('F3229A0B371ED2D9441B830D21A390C3')));
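The ciphertext length formula given in the AES_ENCRYPT() description is simple enough to evaluate by hand, but a short calculation makes the padding behavior visible. This Python sketch only restates that formula; `aes_crypt_length` is a hypothetical name and no encryption is performed.

```python
def aes_crypt_length(string_length: int, block_size: int = 16) -> int:
    """Apply 16 * (trunc(string_length / 16) + 1) from the text above."""
    return block_size * (string_length // block_size + 1)


for n in (0, 4, 15, 16, 17):
    print(n, aes_crypt_length(n))
```

Note that an input whose length is already a multiple of 16 still grows by one full block, because the formula always adds padding.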
A passphrase can be used to generate an AES key by hashing the passphrase. For example:
INSERT INTO t VALUES (1,AES_ENCRYPT('text', UNHEX(SHA2('My secret passphrase',512))));
Do not pass a password or passphrase directly to key_str; hash it first. Previous versions of this documentation suggested the former approach, but it is no longer recommended as the examples shown here are more secure.
If AES_DECRYPT() detects invalid data or incorrect padding, it returns NULL. However, it is possible for AES_DECRYPT() to return a non-NULL value (possibly garbage) if the input data or the key is invalid.
AES_ENCRYPT() and AES_DECRYPT() permit control of the block encryption mode and take an optional init_vector initialization vector argument:
The block_encryption_mode system variable controls the mode for block-based encryption algorithms. Its default value is aes-128-ecb, which signifies encryption using a key length of 128 bits and ECB mode. For a description of the permitted values of this variable, see Section 5.1.8, “Server System Variables”.
The optional init_vector argument provides an initialization vector for block encryption modes that require it.
For modes that require the optional init_vector argument, it must be 16 bytes or longer (bytes in excess of 16 are ignored). An error occurs if init_vector is missing.
For modes that do not require init_vector, it is ignored and a warning is generated if it is specified.
A random string of bytes to use for the initialization vector can be produced by calling RANDOM_BYTES(16). For encryption modes that require an initialization vector, the same vector must be used for encryption and decryption.
mysql> SET block_encryption_mode = 'aes-256-cbc';
mysql> SET @key_str = SHA2('My secret passphrase',512);
mysql> SET @init_vector = RANDOM_BYTES(16);
mysql> SET @crypt_str = AES_ENCRYPT('text',@key_str,@init_vector);
mysql> SELECT AES_DECRYPT(@crypt_str,@key_str,@init_vector);
+-----------------------------------------------+
| AES_DECRYPT(@crypt_str,@key_str,@init_vector) |
+-----------------------------------------------+
| text                                          |
+-----------------------------------------------+
The following table lists each permitted block encryption mode and whether the initialization vector argument is required.
Block Encryption Mode | Initialization Vector Required |
---|---|
ECB | No |
CBC | Yes |
CFB1 | Yes |
CFB8 | Yes |
CFB128 | Yes |
OFB | Yes |
Statements that use AES_ENCRYPT() or AES_DECRYPT() are unsafe for statement-based replication.
COMPRESS()
Compresses a string and returns the result as a binary string. This function requires MySQL to have been compiled with a compression library such as zlib. Otherwise, the return value is always NULL. The compressed string can be uncompressed with UNCOMPRESS().
mysql> SELECT LENGTH(COMPRESS(REPEAT('a',1000)));
        -> 21
mysql> SELECT LENGTH(COMPRESS(''));
        -> 0
mysql> SELECT LENGTH(COMPRESS('a'));
        -> 13
mysql> SELECT LENGTH(COMPRESS(REPEAT('a',16)));
        -> 15
The compressed string contents are stored the following way:
Empty strings are stored as empty strings.
Nonempty strings are stored as a 4-byte length of the uncompressed string (low byte first), followed by the compressed string. If the string ends with space, an extra . character is added to avoid problems with endspace trimming should the result be stored in a CHAR or VARCHAR column. (However, use of nonbinary string data types such as CHAR or VARCHAR to store compressed strings is not recommended anyway because character set conversion may occur. Use a VARBINARY or BLOB binary string column instead.)
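The storage layout just described (a 4-byte little-endian length followed by the compressed data) can be sketched with Python's `struct` and `zlib` modules. This is a model for illustration under assumptions, not the server's implementation: the function names are hypothetical, the trailing-space rule is read as applying to the compressed result, and the compressed bytes need not match what a given MySQL build produces.

```python
import struct
import zlib


def compress_like_mysql(data: bytes) -> bytes:
    """Build: empty for empty input, else 4-byte LE length + zlib stream."""
    if not data:
        return b""
    blob = struct.pack("<I", len(data)) + zlib.compress(data)
    if blob.endswith(b" "):
        blob += b"."          # guard against endspace trimming
    return blob


def uncompressed_length(blob: bytes) -> int:
    """Read the stored length prefix (low byte first)."""
    return struct.unpack("<I", blob[:4])[0] if blob else 0


def uncompress_like_mysql(blob: bytes) -> bytes:
    """Skip the length prefix; any guard byte lands in unused_data."""
    if not blob:
        return b""
    return zlib.decompressobj().decompress(blob[4:])


original = b"a" * 1000
stored = compress_like_mysql(original)
print(len(stored), uncompressed_length(stored))
print(uncompress_like_mysql(stored) == original)
```

Because the length prefix is stored uncompressed, the original size can be read without decompressing the data.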
MD5()
Calculates an MD5 128-bit checksum for the string. The value is returned as a string of 32 hexadecimal digits, or NULL if the argument was NULL. The return value can, for example, be used as a hash key. See the notes at the beginning of this section about storing hash values efficiently.
The return value is a string in the connection character set.
If FIPS mode is enabled, MD5() returns NULL. See Section 6.8, “FIPS Support”.
mysql> SELECT MD5('testing');
        -> 'ae2b1fca515949e5d54fb22b8ed95575'
This is the “RSA Data Security, Inc. MD5 Message-Digest Algorithm.”
See the note regarding the MD5 algorithm at the beginning of this section.
RANDOM_BYTES(len)
This function returns a binary string of len random bytes generated using the random number generator of the SSL library. Permitted values of len range from 1 to 1024. For values outside that range, an error occurs.
RANDOM_BYTES() can be used to provide the initialization vector for the AES_DECRYPT() and AES_ENCRYPT() functions. For use in that context, len must be at least 16. Larger values are permitted, but bytes in excess of 16 are ignored.
RANDOM_BYTES() generates a random value, which makes its result nondeterministic. Consequently, statements that use this function are unsafe for statement-based replication.
SHA1(), SHA()
Calculates an SHA-1 160-bit checksum for the string, as described in RFC 3174 (Secure Hash Algorithm). The value is returned as a string of 40 hexadecimal digits, or NULL if the argument was NULL. One of the possible uses for this function is as a hash key. See the notes at the beginning of this section about storing hash values efficiently. SHA() is synonymous with SHA1().
The return value is a string in the connection character set.
mysql>
SELECT SHA1('abc');
-> 'a9993e364706816aba3e25717850c26c9cd0d89d'SHA1()
can be considered a cryptographically more secure equivalent ofMD5()
. However, see the note regarding the MD5 and SHA-1 algorithms at the beginning this section.Calculates the SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384, and SHA-512). The first argument is the plaintext string to be hashed. The second argument indicates the desired bit length of the result, which must have a value of 224, 256, 384, 512, or 0 (which is equivalent to 256). If either argument is
NULL
or the hash length is not one of the permitted values, the return value isNULL
. Otherwise, the function result is a hash value containing the desired number of bits. See the notes at the beginning of this section about storing hash values efficiently.The return value is a string in the connection character set.
mysql>
SELECT SHA2('abc', 224);
-> '23097d223405d8228642a477bda255b32aadbce4bda0b3f7e36c9da7'This function works only if MySQL has been configured with SSL support. See Section 6.3, “Using Encrypted Connections”.
SHA2()
can be considered cryptographically more secure thanMD5()
orSHA1()
.Given an SQL statement as a string, returns the statement digest hash value as a string in the connection character set, or
NULL
if the argument isNULL
. The relatedSTATEMENT_DIGEST_TEXT()
function returns the normalized statement digest. For information about statement digesting, see Section 27.10, “Performance Schema Statement Digests and Sampling”.Both functions use the MySQL parser to parse the statement. If parsing fails, an error occurs. The error message includes the parse error only if the statement is provided as a literal string.
The
max_digest_length
system variable determines the maximum number of bytes available to these functions for computing normalized statement digests.mysql>
SET @stmt = 'SELECT * FROM mytable WHERE cola = 10 AND colb = 20';
mysql>SELECT STATEMENT_DIGEST(@stmt);
+------------------------------------------------------------------+ | STATEMENT_DIGEST(@stmt) | +------------------------------------------------------------------+ | 3bb95eeade896657c4526e74ff2a2862039d0a0fe8a9e7155b5fe492cbd78387 | +------------------------------------------------------------------+ mysql>SELECT STATEMENT_DIGEST_TEXT(@stmt);
+----------------------------------------------------------+ | STATEMENT_DIGEST_TEXT(@stmt) | +----------------------------------------------------------+ | SELECT * FROM `mytable` WHERE `cola` = ? AND `colb` = ? | +----------------------------------------------------------+STATEMENT_DIGEST_TEXT(
statement
)Given an SQL statement as a string, returns the normalized statement digest as a string in the connection character set, or
NULL
if the argument isNULL
. For additional discussion and examples, see the description of the relatedSTATEMENT_DIGEST()
function.UNCOMPRESS(
string_to_uncompress
)Uncompresses a string compressed by the
COMPRESS()
function. If the argument is not a compressed value, the result isNULL
. This function requires MySQL to have been compiled with a compression library such aszlib
. Otherwise, the return value is alwaysNULL
.mysql>
SELECT UNCOMPRESS(COMPRESS('any string'));
-> 'any string' mysql>SELECT UNCOMPRESS('any string');
-> NULLUNCOMPRESSED_LENGTH(
compressed_string
)Returns the length that the compressed string had before being compressed.
mysql>
SELECT UNCOMPRESSED_LENGTH(COMPRESS(REPEAT('a',30)));
-> 30VALIDATE_PASSWORD_STRENGTH(
str
)Given an argument representing a plaintext password, this function returns an integer to indicate how strong the password is. The return value ranges from 0 (weak) to 100 (strong).
Password assessment by
VALIDATE_PASSWORD_STRENGTH()
is done by thevalidate_password
component. If that component is not installed, the function always returns 0. For information about installingvalidate_password
, see Section 6.4.3, “The Password Validation Component”. To examine or configure the parameters that affect password testing, check or set the system variables implemented byvalidate_password
. See Section 6.4.3.2, “Password Validation Options and Variables”.The password is subjected to increasingly strict tests and the return value reflects which tests were satisfied, as shown in the following table. In addition, if the
validate_password.check_user_name
system variable is enabled and the password matches the user name,VALIDATE_PASSWORD_STRENGTH()
returns 0 regardless of how othervalidate_password
system variables are set.Password Test Return Value Length < 4 0 Length ≥ 4 and < validate_password.length
25 Satisfies policy 1 ( LOW
)50 Satisfies policy 2 ( MEDIUM
)75 Satisfies policy 3 ( STRONG
)100
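As a rough sketch of how the password tests tabulated above might be explored, the following statements score two sample passwords and then list the settings that drive the result. The sample passwords are arbitrary. Only the first result is predictable from the table (a password shorter than 4 characters scores 0); the second depends on the validate_password settings in effect, and is always 0 if the component is not installed.

```sql
-- Shorter than 4 characters: the table above gives a return value of 0.
SELECT VALIDATE_PASSWORD_STRENGTH('abc');

-- Longer, mixed-class password: the score depends on the configured policy,
-- so no expected value is shown here.
SELECT VALIDATE_PASSWORD_STRENGTH('N0t-a-Real-Passw0rd!');

-- Inspect the variables that influence the tests (empty if not installed).
SHOW VARIABLES LIKE 'validate_password%';
```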
This section describes functions used to manipulate user-level locks.
Table 12.19 Locking Functions
Name | Description |
---|---|
GET_LOCK() | Get a named lock |
IS_FREE_LOCK() | Whether the named lock is free |
IS_USED_LOCK() | Whether the named lock is in use; return connection identifier if true |
RELEASE_ALL_LOCKS() | Release all current named locks |
RELEASE_LOCK() | Release the named lock |
GET_LOCK(str,timeout)
Tries to obtain a lock with a name given by the string str, using a timeout of timeout seconds. A negative timeout value means infinite timeout. The lock is exclusive. While held by one session, other sessions cannot obtain a lock of the same name.
Returns 1 if the lock was obtained successfully, 0 if the attempt timed out (for example, because another client has previously locked the name), or NULL if an error occurred (such as running out of memory or the thread was killed with mysqladmin kill).
A lock obtained with GET_LOCK() is released explicitly by executing RELEASE_LOCK() or implicitly when your session terminates (either normally or abnormally). Locks obtained with GET_LOCK() are not released when transactions commit or roll back.
GET_LOCK() is implemented using the metadata locking (MDL) subsystem. Multiple simultaneous locks can be acquired and GET_LOCK() does not release any existing locks. For example, suppose that you execute these statements:
SELECT GET_LOCK('lock1',10);
SELECT GET_LOCK('lock2',10);
SELECT RELEASE_LOCK('lock2');
SELECT RELEASE_LOCK('lock1');
The second GET_LOCK() acquires a second lock and both RELEASE_LOCK() calls return 1 (success).
It is even possible for a given session to acquire multiple locks for the same name. Other sessions cannot acquire a lock with that name until the acquiring session releases all its locks for the name.
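The same-name case described above can be sketched as follows. The lock name app1.report is illustrative, and the results noted in the comments assume that no other session holds or requests that name while the statements run.

```sql
-- Acquire the same name twice within one session.
SELECT GET_LOCK('app1.report', 10);   -- 1: first instance acquired
SELECT GET_LOCK('app1.report', 10);   -- 1: second instance for the same name

-- One release is not enough; the name stays locked for other sessions.
SELECT RELEASE_LOCK('app1.report');   -- 1: one instance released
SELECT IS_FREE_LOCK('app1.report');   -- 0: still held by this session

-- The second release frees the name.
SELECT RELEASE_LOCK('app1.report');   -- 1: last instance released
SELECT IS_FREE_LOCK('app1.report');   -- 1: name is free again
```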
Uniquely named locks acquired with GET_LOCK() appear in the Performance Schema metadata_locks table. The OBJECT_TYPE column says USER LEVEL LOCK and the OBJECT_NAME column indicates the lock name. In the case that multiple locks are acquired for the same name, only the first lock for the name registers a row in the metadata_locks table. Subsequent locks for the name increment a counter in the lock but do not acquire additional metadata locks. The metadata_locks row for the lock is deleted when the last lock instance on the name is released.
The capability of acquiring multiple locks means there is the possibility of deadlock among clients. When this happens, the server chooses a caller and terminates its lock-acquisition request with an ER_USER_LOCK_DEADLOCK error. This error does not cause transactions to roll back.
MySQL enforces a maximum length on lock names of 64 characters.
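To see how named locks show up in the metadata_locks table described above, a query along these lines may help. It is a sketch that assumes the Performance Schema and its metadata lock instrumentation are enabled; the lock name demo.inventory is illustrative.

```sql
-- Acquire a lock so that there is something to observe.
SELECT GET_LOCK('demo.inventory', 0);

-- User-level locks are listed with OBJECT_TYPE = 'USER LEVEL LOCK'.
-- OWNER_THREAD_ID is a Performance Schema thread ID, not a connection ID.
SELECT OBJECT_NAME, LOCK_STATUS, OWNER_THREAD_ID
FROM performance_schema.metadata_locks
WHERE OBJECT_TYPE = 'USER LEVEL LOCK';

-- Release the lock; DO discards the return value.
DO RELEASE_LOCK('demo.inventory');
```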
GET_LOCK() can be used to implement application locks or to simulate record locks. Names are locked on a server-wide basis. If a name has been locked within one session, GET_LOCK() blocks any request by another session for a lock with the same name. This enables clients that agree on a given lock name to use the name to perform cooperative advisory locking. But be aware that it also enables a client that is not among the set of cooperating clients to lock a name, either inadvertently or deliberately, and thus prevent any of the cooperating clients from locking that name. One way to reduce the likelihood of this is to use lock names that are database-specific or application-specific. For example, use lock names of the form db_name.str or app_name.str.
If multiple clients are waiting for a lock, the order in which they acquire it is undefined. Applications should not assume that clients acquire the lock in the same order that they issued the lock requests.
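A cooperative advisory lock with an application-specific name might be used roughly as follows. This is a sketch only: the name billing.nightly_rollup and the user variable @got_lock are illustrative, and the decision whether to proceed is left to the client application.

```sql
-- Try for up to 5 seconds; the result is 1 (acquired), 0 (timed out), or NULL (error).
SELECT GET_LOCK('billing.nightly_rollup', 5) INTO @got_lock;

-- The client should run the protected work only when the result is 1.
SELECT IF(@got_lock = 1, 'lock acquired', 'lock not acquired') AS status;

-- ... protected work goes here ...

-- Release when finished; DO discards the return value.
DO RELEASE_LOCK('billing.nightly_rollup');
```

Releasing a name that was not acquired is harmless here: RELEASE_LOCK() then reports 0 or NULL rather than releasing another session's lock.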
GET_LOCK() is unsafe for statement-based replication. A warning is logged if you use this function when binlog_format is set to STATEMENT.
Caution
With the capability of acquiring multiple named locks, it is possible for a single statement to acquire a large number of locks. For example:
INSERT INTO ... SELECT GET_LOCK(t1.col_name) FROM t1;
These types of statements may have certain adverse effects. For example, if the statement fails part way through and rolls back, locks acquired up to the point of failure still exist. If the intent is for there to be a correspondence between rows inserted and locks acquired, that intent is not satisfied. Also, if it is important that locks are granted in a certain order, be aware that result set order may differ depending on which execution plan the optimizer chooses. For these reasons, it may be best to limit applications to a single lock-acquisition call per statement.
A different locking interface is available as either a plugin service or a set of user-defined functions. This interface provides lock namespaces and distinct read and write locks, unlike the interface provided by GET_LOCK() and related functions. For details, see Section 5.6.8.1, “The Locking Service”.
IS_FREE_LOCK(str)
Checks whether the lock named str is free to use (that is, not locked). Returns 1 if the lock is free (no one is using the lock), 0 if the lock is in use, and NULL if an error occurs (such as an incorrect argument).
This function is unsafe for statement-based replication. A warning is logged if you use this function when binlog_format is set to STATEMENT.
IS_USED_LOCK(str)
Checks whether the lock named str is in use (that is, locked). If so, it returns the connection identifier of the client session that holds the lock. Otherwise, it returns NULL.
This function is unsafe for statement-based replication. A warning is logged if you use this function when binlog_format is set to STATEMENT.
RELEASE_ALL_LOCKS()
Releases all named locks held by the current session and returns the number of locks released (0 if there were none).
This function is unsafe for statement-based replication. A warning is logged if you use this function when binlog_format is set to STATEMENT.
RELEASE_LOCK(str)
Releases the lock named by the string str that was obtained with GET_LOCK(). Returns 1 if the lock was released, 0 if the lock was not established by this thread (in which case the lock is not released), and NULL if the named lock did not exist. The lock does not exist if it was never obtained by a call to GET_LOCK() or if it has previously been released.
The DO statement is convenient to use with RELEASE_LOCK(). See Section 13.2.3, “DO Statement”.
This function is unsafe for statement-based replication. A warning is logged if you use this function when binlog_format is set to STATEMENT.
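The inspection and release functions described above can be exercised together, as in the following sketch. The lock names demo.a and demo.b are illustrative, and the results noted in the comments assume that both locks are acquired and that the session holds no other named locks.

```sql
-- A timeout of 0 makes each request return immediately.
SELECT GET_LOCK('demo.a', 0), GET_LOCK('demo.b', 0);

SELECT IS_FREE_LOCK('demo.a');                    -- 0: the lock is in use
SELECT IS_USED_LOCK('demo.a') = CONNECTION_ID();  -- 1: held by this session

SELECT RELEASE_ALL_LOCKS();                       -- 2: both locks released
SELECT RELEASE_LOCK('demo.a');                    -- NULL: the lock no longer exists
```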
Table 12.20 Information Functions
Name | Description |
---|---|
BENCHMARK() | Repeatedly execute an expression |
CHARSET() | Return the character set of the argument |
COERCIBILITY() | Return the collation coercibility value of the string argument |
COLLATION() | Return the collation of the string argument |
CONNECTION_ID() | Return the connection ID (thread ID) for the connection |
CURRENT_ROLE() | Return the current active roles |
CURRENT_USER(), CURRENT_USER | The authenticated user name and host name |
DATABASE() | Return the default (current) database name |
FOUND_ROWS() | For a SELECT with a LIMIT clause, the number of rows that would be returned were there no LIMIT clause |
ICU_VERSION() | ICU library version |
LAST_INSERT_ID() | Value of the AUTO_INCREMENT column for the last INSERT |
ROLES_GRAPHML() | Return a GraphML document representing memory role subgraphs |
ROW_COUNT() | The number of rows updated |
SCHEMA() | Synonym for DATABASE() |
SESSION_USER() | Synonym for USER() |
SYSTEM_USER() | Synonym for USER() |
USER() | The user name and host name provided by the client |
VERSION() | Return a string that indicates the MySQL server version |
BENCHMARK(count,expr)
The BENCHMARK() function executes the expression expr repeatedly count times. It may be used to time how quickly MySQL processes the expression. The result value is 0, or NULL for inappropriate arguments such as a NULL or negative repeat count.
The intended use is from within the mysql client, which reports query execution times:
mysql> SELECT BENCHMARK(1000000,AES_ENCRYPT('hello','goodbye'));
+---------------------------------------------------+
| BENCHMARK(1000000,AES_ENCRYPT('hello','goodbye')) |
+---------------------------------------------------+
|                                                 0 |
+---------------------------------------------------+
1 row in set (4.74 sec)
The time reported is elapsed time on the client end, not CPU time on the server end. It is advisable to execute BENCHMARK() several times, and to interpret the result with regard to how heavily loaded the server machine is.
BENCHMARK() is intended for measuring the runtime performance of scalar expressions, which has some significant implications for the way that you use it and interpret the results:
- Only scalar expressions can be used. Although the expression can be a subquery, it must return a single column and at most a single row. For example, BENCHMARK(10, (SELECT * FROM t)) fails if the table t has more than one column or more than one row.
- Executing a SELECT expr statement N times differs from executing SELECT BENCHMARK(N, expr) in terms of the amount of overhead involved. The two have very different execution profiles and you should not expect them to take the same amount of time. The former involves the parser, optimizer, table locking, and runtime evaluation N times each. The latter involves only runtime evaluation N times, and all the other components just once. Memory structures already allocated are reused, and runtime optimizations such as local caching of results already evaluated for aggregate functions can alter the results. Use of BENCHMARK() thus measures performance of the runtime component by giving more weight to that component and removing the “noise” introduced by the network, parser, optimizer, and so forth.
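One way to apply the points above is to compare two scalar expressions under the same repeat count, as in this sketch. The choice of MD5() and SHA2() and the repeat count are arbitrary. Each statement returns 0; the figure of interest is the elapsed time that the mysql client prints after each result, which varies with hardware and server load.

```sql
-- Run each statement several times and compare the elapsed times reported
-- by the mysql client; the result value itself is always 0.
SELECT BENCHMARK(1000000, MD5('hello'));
SELECT BENCHMARK(1000000, SHA2('hello', 256));
```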
CHARSET(str)
Returns the character set of the string argument.
mysql> SELECT CHARSET('abc');
        -> 'utf8'
mysql> SELECT CHARSET(CONVERT('abc' USING latin1));
        -> 'latin1'
mysql> SELECT CHARSET(USER());
        -> 'utf8'
COERCIBILITY(str)
Returns the collation coercibility value of the string argument.
mysql> SELECT COERCIBILITY('abc' COLLATE utf8_swedish_ci);
        -> 0
mysql> SELECT COERCIBILITY(USER());
        -> 3
mysql> SELECT COERCIBILITY('abc');
        -> 4
mysql> SELECT COERCIBILITY(1000);
        -> 5
The return values have the meanings shown in the following table. Lower values have higher precedence.
Coercibility | Meaning | Example |
---|---|---|
0 | Explicit collation | Value with COLLATE clause |
1 | No collation | Concatenation of strings with different collations |
2 | Implicit collation | Column value, stored routine parameter or local variable |
3 | System constant | USER() return value |
4 | Coercible | Literal string |
5 | Numeric | Numeric or temporal value |
6 | Ignorable | NULL or an expression derived from NULL |
For more information, see Section 10.8.4, “Collation Coercibility in Expressions”.
COLLATION(str)
Returns the collation of the string argument.
mysql> SELECT COLLATION('abc');
        -> 'utf8_general_ci'
mysql> SELECT COLLATION(_utf8mb4'abc');
        -> 'utf8mb4_0900_ai_ci'
mysql> SELECT COLLATION(_latin1'abc');
        -> 'latin1_swedish_ci'
CONNECTION_ID()
Returns the connection ID (thread ID) for the connection. Every connection has an ID that is unique among the set of currently connected clients.
The value returned by CONNECTION_ID() is the same type of value as displayed in the ID column of the INFORMATION_SCHEMA.PROCESSLIST table, the Id column of SHOW PROCESSLIST output, and the PROCESSLIST_ID column of the Performance Schema threads table.
mysql> SELECT CONNECTION_ID();
        -> 23786
Warning
Changing the session value of the pseudo_thread_id system variable changes the value returned by the CONNECTION_ID() function.
CURRENT_ROLE()
Returns a utf8 string containing the current active roles for the current session, separated by commas, or NONE if there are none. The value reflects the setting of the sql_quote_show_create system variable.
Suppose that an account is granted roles as follows:
GRANT 'r1', 'r2' TO 'u1'@'localhost';
SET DEFAULT ROLE ALL TO 'u1'@'localhost';
In sessions for u1, the initial CURRENT_ROLE() value names the default account roles. Using SET ROLE changes that:
mysql> SELECT CURRENT_ROLE();
+-------------------+
| CURRENT_ROLE()    |
+-------------------+
| `r1`@`%`,`r2`@`%` |
+-------------------+
mysql> SET ROLE 'r1'; SELECT CURRENT_ROLE();
+----------------+
| CURRENT_ROLE() |
+----------------+
| `r1`@`%`       |
+----------------+
CURRENT_USER, CURRENT_USER()
Returns the user name and host name combination for the MySQL account that the server used to authenticate the current client. This account determines your access privileges. The return value is a string in the utf8 character set.
The value of CURRENT_USER() can differ from the value of USER().
mysql> SELECT USER();
        -> 'davida@localhost'
mysql> SELECT * FROM mysql.user;
ERROR 1044: Access denied for user ''@'localhost' to database 'mysql'
mysql> SELECT CURRENT_USER();
        -> '@localhost'
The example illustrates that although the client specified a user name of davida (as indicated by the value of the USER() function), the server authenticated the client using an anonymous user account (as seen by the empty user name part of the CURRENT_USER() value). One way this might occur is that there is no account listed in the grant tables for davida.
Within a stored program or view, CURRENT_USER() returns the account for the user who defined the object (as given by its DEFINER value) unless defined with the SQL SECURITY INVOKER characteristic. In the latter case, CURRENT_USER() returns the object's invoker.
Triggers and events have no option to define the SQL SECURITY characteristic, so for these objects, CURRENT_USER() returns the account for the user who defined the object. To return the invoker, use USER() or SESSION_USER().
The following statements support use of the CURRENT_USER() function to take the place of the name of (and, possibly, a host for) an affected user or a definer; in such cases, CURRENT_USER() is expanded where and as needed:
For information about the implications that this expansion of CURRENT_USER() has for replication, see Section 17.5.1.8, “Replication of CURRENT_USER()”.
DATABASE()
Returns the default (current) database name as a string in the utf8 character set. If there is no default database, DATABASE() returns NULL. Within a stored routine, the default database is the database that the routine is associated with, which is not necessarily the same as the database that is the default in the calling context.
mysql> SELECT DATABASE();
        -> 'test'
FOUND_ROWS()
Note
The SQL_CALC_FOUND_ROWS query modifier and accompanying FOUND_ROWS() function are deprecated as of MySQL 8.0.17; expect them to be removed in a future version of MySQL. As a replacement, consider executing your query with LIMIT, and then a second query with COUNT(*) and without LIMIT to determine whether there are additional rows. For example, instead of these queries:
SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name WHERE id > 100 LIMIT 10;
SELECT FOUND_ROWS();
Use these queries instead:
SELECT * FROM tbl_name WHERE id > 100 LIMIT 10;
SELECT COUNT(*) FROM tbl_name WHERE id > 100;
COUNT(*) is subject to certain optimizations. SQL_CALC_FOUND_ROWS causes some optimizations to be disabled.
A SELECT statement may include a LIMIT clause to restrict the number of rows the server returns to the client. In some cases, it is desirable to know how many rows the statement would have returned without the LIMIT, but without running the statement again. To obtain this row count, include an SQL_CALC_FOUND_ROWS option in the SELECT statement, and then invoke FOUND_ROWS() afterward:
mysql> SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name
    -> WHERE id > 100 LIMIT 10;
mysql> SELECT FOUND_ROWS();
The second SELECT returns a number indicating how many rows the first SELECT would have returned had it been written without the LIMIT clause.
In the absence of the SQL_CALC_FOUND_ROWS option in the most recent successful SELECT statement, FOUND_ROWS() returns the number of rows in the result set returned by that statement. If the statement includes a LIMIT clause, FOUND_ROWS() returns the number of rows up to the limit. For example, FOUND_ROWS() returns 10 or 60, respectively, if the statement includes LIMIT 10 or LIMIT 50, 10.
The row count available through FOUND_ROWS() is transient and not intended to be available past the statement following the SELECT SQL_CALC_FOUND_ROWS statement. If you need to refer to the value later, save it:
mysql> SELECT SQL_CALC_FOUND_ROWS * FROM ... ;
mysql> SET @rows = FOUND_ROWS();
If you are using SELECT SQL_CALC_FOUND_ROWS, MySQL must calculate how many rows are in the full result set. However, this is faster than running the query again without LIMIT, because the result set need not be sent to the client.
SQL_CALC_FOUND_ROWS and FOUND_ROWS() can be useful in situations when you want to restrict the number of rows that a query returns, but also determine the number of rows in the full result set without running the query again. An example is a Web script that presents a paged display containing links to the pages that show other sections of a search result. Using FOUND_ROWS() enables you to determine how many other pages are needed for the rest of the result.
The use of SQL_CALC_FOUND_ROWS and FOUND_ROWS() is more complex for UNION statements than for simple SELECT statements, because LIMIT may occur at multiple places in a UNION. It may be applied to individual SELECT statements in the UNION, or global to the UNION result as a whole.
The intent of SQL_CALC_FOUND_ROWS for UNION is that it should return the row count that would be returned without a global LIMIT. The conditions for use of SQL_CALC_FOUND_ROWS with UNION are:
- The SQL_CALC_FOUND_ROWS keyword must appear in the first SELECT of the UNION.
- The value of FOUND_ROWS() is exact only if UNION ALL is used. If UNION without ALL is used, duplicate removal occurs and the value of FOUND_ROWS() is only approximate.
- If no LIMIT is present in the UNION, SQL_CALC_FOUND_ROWS is ignored and FOUND_ROWS() returns the number of rows in the temporary table that is created to process the UNION.
Beyond the cases described here, the behavior of FOUND_ROWS() is undefined (for example, its value following a SELECT statement that fails with an error).
Important
FOUND_ROWS() is not replicated reliably using statement-based replication. This function is automatically replicated using row-based replication.
ICU_VERSION()
The version of the International Components for Unicode (ICU) library used to support regular expression operations (see Section 12.8.2, “Regular Expressions”). This function is primarily intended for use in test cases.
LAST_INSERT_ID(), LAST_INSERT_ID(expr)
With no argument, LAST_INSERT_ID() returns a BIGINT UNSIGNED (64-bit) value representing the first automatically generated value successfully inserted for an AUTO_INCREMENT column as a result of the most recently executed INSERT statement. The value of LAST_INSERT_ID() remains unchanged if no rows are successfully inserted.
With an argument, LAST_INSERT_ID() returns an unsigned integer.
For example, after inserting a row that generates an AUTO_INCREMENT value, you can get the value like this:
mysql> SELECT LAST_INSERT_ID();
        -> 195
The currently executing statement does not affect the value of LAST_INSERT_ID(). Suppose that you generate an AUTO_INCREMENT value with one statement, and then refer to LAST_INSERT_ID() in a multiple-row INSERT statement that inserts rows into a table with its own AUTO_INCREMENT column. The value of LAST_INSERT_ID() remains stable in the second statement; its value for the second and later rows is not affected by the earlier row insertions. (You should be aware that, if you mix references to LAST_INSERT_ID() and LAST_INSERT_ID(expr), the effect is undefined.)
If the previous statement returned an error, the value of LAST_INSERT_ID() is undefined. For transactional tables, if the statement is rolled back due to an error, the value of LAST_INSERT_ID() is left undefined. For manual ROLLBACK, the value of LAST_INSERT_ID() is not restored to that before the transaction; it remains as it was at the point of the ROLLBACK.
Within the body of a stored routine (procedure or function) or a trigger, the value of LAST_INSERT_ID() changes the same way as for statements executed outside the body of these kinds of objects. The effect of a stored routine or trigger upon the value of LAST_INSERT_ID() that is seen by following statements depends on the kind of routine:
- If a stored procedure executes statements that change the value of LAST_INSERT_ID(), the changed value is seen by statements that follow the procedure call.
- For stored functions and triggers that change the value, the value is restored when the function or trigger ends, so statements coming after it do not see a changed value.
The ID that was generated is maintained in the server on a per-connection basis. This means that the value returned by the function to a given client is the first AUTO_INCREMENT value generated for the most recent statement affecting an AUTO_INCREMENT column by that client. This value cannot be affected by other clients, even if they generate AUTO_INCREMENT values of their own. This behavior ensures that each client can retrieve its own ID without concern for the activity of other clients, and without the need for locks or transactions.
The value of LAST_INSERT_ID() is not changed if you set the AUTO_INCREMENT column of a row to a non-“magic” value (that is, a value that is not NULL and not 0).
Important
If you insert multiple rows using a single INSERT statement, LAST_INSERT_ID() returns the value generated for the first inserted row only. The reason for this is to make it possible to reproduce easily the same INSERT statement against some other server.
For example:
mysql> USE test;
mysql> CREATE TABLE t (
         id INT AUTO_INCREMENT NOT NULL PRIMARY KEY,
         name VARCHAR(10) NOT NULL
       );
mysql> INSERT INTO t VALUES (NULL, 'Bob');
mysql> SELECT * FROM t;
+----+------+
| id | name |
+----+------+
|  1 | Bob  |
+----+------+
mysql> SELECT LAST_INSERT_ID();
+------------------+
| LAST_INSERT_ID() |
+------------------+
|                1 |
+------------------+
mysql> INSERT INTO t VALUES
       (NULL, 'Mary'), (NULL, 'Jane'), (NULL, 'Lisa');
mysql> SELECT * FROM t;
+----+------+
| id | name |
+----+------+
|  1 | Bob  |
|  2 | Mary |
|  3 | Jane |
|  4 | Lisa |
+----+------+
mysql> SELECT LAST_INSERT_ID();
+------------------+
| LAST_INSERT_ID() |
+------------------+
|                2 |
+------------------+
Although the second INSERT statement inserted three new rows into t, the ID generated for the first of these rows was 2, and it is this value that is returned by LAST_INSERT_ID() for the following SELECT statement.
If you use INSERT IGNORE and the row is ignored, LAST_INSERT_ID() remains unchanged from the current value (or 0 is returned if the connection has not yet performed a successful INSERT) and, for non-transactional tables, the AUTO_INCREMENT counter is not incremented. For InnoDB tables, the AUTO_INCREMENT counter is incremented if innodb_autoinc_lock_mode is set to 1 or 2, as demonstrated in the following example:
mysql> USE test;
mysql> SELECT @@innodb_autoinc_lock_mode;
+----------------------------+
| @@innodb_autoinc_lock_mode |
+----------------------------+
|                          1 |
+----------------------------+
mysql> CREATE TABLE `t` (
       `id` INT(11) NOT NULL AUTO_INCREMENT,
       `val` INT(11) DEFAULT NULL,
       PRIMARY KEY (`id`),
       UNIQUE KEY `i1` (`val`)
       ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
# Insert two rows
mysql> INSERT INTO t (val) VALUES (1),(2);
# With auto_increment_offset=1, the inserted rows
# result in an AUTO_INCREMENT value of 3
mysql> SHOW CREATE TABLE t\G
*************************** 1. row ***************************
       Table: t
Create Table: CREATE TABLE `t` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `val` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `i1` (`val`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=latin1
# LAST_INSERT_ID() returns the first automatically generated
# value that is successfully inserted for the AUTO_INCREMENT column
mysql> SELECT LAST_INSERT_ID();
+------------------+
| LAST_INSERT_ID() |
+------------------+
|                1 |
+------------------+
# The attempted insertion of duplicate rows fails but errors are ignored
mysql> INSERT IGNORE INTO t (val) VALUES (1),(2);
Query OK, 0 rows affected (0.00 sec)
Records: 2  Duplicates: 2  Warnings: 0
# With innodb_autoinc_lock_mode=1, the AUTO_INCREMENT counter
# is incremented for the ignored rows
mysql> SHOW CREATE TABLE t\G
*************************** 1. row ***************************
       Table: t
Create Table: CREATE TABLE `t` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `val` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `i1` (`val`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=latin1
# The LAST_INSERT_ID is unchanged because the previous insert was unsuccessful
mysql> SELECT LAST_INSERT_ID();
+------------------+
| LAST_INSERT_ID() |
+------------------+
|                1 |
+------------------+
For more information, see Section 15.6.1.6, “AUTO_INCREMENT Handling in InnoDB”.
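A common use of the per-connection behavior described above is to link a child row to a parent row that was just inserted. The following is a minimal sketch; the tables orders and order_items and the user variable @order_id are hypothetical. The value is saved in a variable because the later INSERT into a table with its own AUTO_INCREMENT column changes what LAST_INSERT_ID() returns afterward.

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE orders (
  id       INT AUTO_INCREMENT PRIMARY KEY,
  customer VARCHAR(20) NOT NULL
);
CREATE TABLE order_items (
  id       INT AUTO_INCREMENT PRIMARY KEY,
  order_id INT NOT NULL,
  item     VARCHAR(20) NOT NULL
);

-- Insert the parent row and remember its generated ID.
INSERT INTO orders (customer) VALUES ('Bob');
SET @order_id = LAST_INSERT_ID();

-- Insert child rows that refer to the saved parent ID.
INSERT INTO order_items (order_id, item)
VALUES (@order_id, 'pen'), (@order_id, 'ink');

-- LAST_INSERT_ID() now reflects the first ID generated by the order_items
-- INSERT, while @order_id still holds the parent ID.
SELECT @order_id, LAST_INSERT_ID();
```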
If
expr
is given as an argument toLAST_INSERT_ID()
, the value of the argument