Difference between String trim() and strip()
strip() looks very similar to trim()
No, the two commands are quite different, although they overlap. See the table below. They target different sets of characters (33 for trim, 25 for strip), with only ten characters in common.
trim targets control codes
strip targets line endings and typographical spaces
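A minimal sketch of that difference, using U+0001 (a control code) and U+2003 EM SPACE (a typographical space) as sample characters. The class name `TrimVsStripDemo` is only illustrative, and `strip` requires Java 11 or later.

```java
final class TrimVsStripDemo {
    public static void main(String[] args) {
        // U+0001 START OF HEADING is a control code: trim removes it, strip does not.
        String control = "\u0001abc\u0001";
        System.out.println(control.trim().length());   // 3
        System.out.println(control.strip().length());  // 5

        // U+2003 EM SPACE is a typographical space: strip removes it, trim does not.
        String emSpace = "\u2003abc\u2003";
        System.out.println(emSpace.trim().length());   // 5
        System.out.println(emSpace.strip().length());  // 3

        // U+0020 SPACE and U+0009 CHARACTER TABULATION are removed by both.
        String plain = " \tabc\t ";
        System.out.println(plain.trim().equals(plain.strip())); // true
    }
}
```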
You might want to use both.

    String result = input.trim().strip() ;

Table
Here is a table showing the characters removed by each command: trim (Java 1.0+) & strip (Java 11+).
This table illustrates the definition of whitespace as seen in Java 24. I doubt this has changed from Java 11 through Java 25 (I’ve not checked), but keep in mind that Unicode evolves so the definition of whitespace could change.

| Code Point Hex | Code Point Dec | Name | `trim` | `strip` |
| :---: | ---: | :--- | :---: | :---: |
| U+0000 | 0 | NULL | ✅ | ❌ |
| U+0001 | 1 | START OF HEADING | ✅ | ❌ |
| U+0002 | 2 | START OF TEXT | ✅ | ❌ |
| U+0003 | 3 | END OF TEXT | ✅ | ❌ |
| U+0004 | 4 | END OF TRANSMISSION | ✅ | ❌ |
| U+0005 | 5 | ENQUIRY | ✅ | ❌ |
| U+0006 | 6 | ACKNOWLEDGE | ✅ | ❌ |
| U+0007 | 7 | BEL | ✅ | ❌ |
| U+0008 | 8 | BACKSPACE | ✅ | ❌ |
| U+0009 | 9 | CHARACTER TABULATION | ✅ | ✅ |
| U+000A | 10 | LINE FEED (LF) | ✅ | ✅ |
| U+000B | 11 | LINE TABULATION | ✅ | ✅ |
| U+000C | 12 | FORM FEED (FF) | ✅ | ✅ |
| U+000D | 13 | CARRIAGE RETURN (CR) | ✅ | ✅ |
| U+000E | 14 | SHIFT OUT | ✅ | ❌ |
| U+000F | 15 | SHIFT IN | ✅ | ❌ |
| U+0010 | 16 | DATA LINK ESCAPE | ✅ | ❌ |
| U+0011 | 17 | DEVICE CONTROL ONE | ✅ | ❌ |
| U+0012 | 18 | DEVICE CONTROL TWO | ✅ | ❌ |
| U+0013 | 19 | DEVICE CONTROL THREE | ✅ | ❌ |
| U+0014 | 20 | DEVICE CONTROL FOUR | ✅ | ❌ |
| U+0015 | 21 | NEGATIVE ACKNOWLEDGE | ✅ | ❌ |
| U+0016 | 22 | SYNCHRONOUS IDLE | ✅ | ❌ |
| U+0017 | 23 | END OF TRANSMISSION BLOCK | ✅ | ❌ |
| U+0018 | 24 | CANCEL | ✅ | ❌ |
| U+0019 | 25 | END OF MEDIUM | ✅ | ❌ |
| U+001A | 26 | SUBSTITUTE | ✅ | ❌ |
| U+001B | 27 | ESCAPE | ✅ | ❌ |
| U+001C | 28 | INFORMATION SEPARATOR FOUR (File Separator) | ✅ | ✅ |
| U+001D | 29 | INFORMATION SEPARATOR THREE (Group Separator) | ✅ | ✅ |
| U+001E | 30 | INFORMATION SEPARATOR TWO (Record Separator) | ✅ | ✅ |
| U+001F | 31 | INFORMATION SEPARATOR ONE (Unit Separator) | ✅ | ✅ |
| U+0020 | 32 | SPACE | ✅ | ✅ |
| U+1680 | 5,760 | OGHAM SPACE MARK | ❌ | ✅ |
| U+2000 | 8,192 | EN QUAD | ❌ | ✅ |
| U+2001 | 8,193 | EM QUAD | ❌ | ✅ |
| U+2002 | 8,194 | EN SPACE | ❌ | ✅ |
| U+2003 | 8,195 | EM SPACE | ❌ | ✅ |
| U+2004 | 8,196 | THREE-PER-EM SPACE | ❌ | ✅ |
| U+2005 | 8,197 | FOUR-PER-EM SPACE | ❌ | ✅ |
| U+2006 | 8,198 | SIX-PER-EM SPACE | ❌ | ✅ |
| U+2008 | 8,200 | PUNCTUATION SPACE | ❌ | ✅ |
| U+2009 | 8,201 | THIN SPACE | ❌ | ✅ |
| U+200A | 8,202 | HAIR SPACE | ❌ | ✅ |
| U+2028 | 8,232 | LINE SEPARATOR | ❌ | ✅ |
| U+2029 | 8,233 | PARAGRAPH SEPARATOR | ❌ | ✅ |
| U+205F | 8,287 | MEDIUM MATHEMATICAL SPACE | ❌ | ✅ |
| U+3000 | 12,288 | IDEOGRAPHIC SPACE | ❌ | ✅ |

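The pattern in the table lines up with how the Javadoc describes the two methods: `trim` removes any character whose code is at or below U+0020, while `strip` removes characters for which `Character.isWhitespace` returns true. As a rough cross-check, not a replacement for the brute-force test used in this Answer, the sketch below counts code points under each rule. The class and method names are made up for illustration.

```java
import java.util.function.IntPredicate;
import java.util.stream.IntStream;

final class WhitespaceRules {
    // Rule documented for String.trim: any character at or below U+0020.
    static boolean trimRemoves(int codePoint) {
        return codePoint <= 0x20;
    }

    // Rule documented for String.strip: Character.isWhitespace(int).
    static boolean stripRemoves(int codePoint) {
        return Character.isWhitespace(codePoint);
    }

    // Count the code points in the full Unicode range that satisfy a rule.
    static long count(IntPredicate rule) {
        return IntStream.rangeClosed(Character.MIN_CODE_POINT, Character.MAX_CODE_POINT)
                .filter(rule)
                .count();
    }

    public static void main(String[] args) {
        IntPredicate trim = WhitespaceRules::trimRemoves;
        IntPredicate strip = WhitespaceRules::stripRemoves;
        // The table above corresponds to 33, 25, and 10 respectively.
        System.out.println("trim:  " + count(trim));
        System.out.println("strip: " + count(strip));
        System.out.println("both:  " + count(trim.and(strip)));

        // Non-breaking spaces are excluded by Character.isWhitespace,
        // which is why U+00A0 and U+2007 do not appear in the table.
        System.out.println(Character.isWhitespace(0x00A0)); // false
        System.out.println(Character.isWhitespace(0x2007)); // false
    }
}
```

The gap between U+2006 and U+2008 in the table is explained by that last point: U+2007 FIGURE SPACE is a non-breaking space, so `strip` leaves it alone.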
Be aware that results may vary in future versions of Java. Unicode evolves.
Java 15 added String::stripIndent, not considered here in this Answer. By the way, Java 15 also added the ability to un-escape text: String::translateEscapes.
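One caveat on combining the two commands as `input.trim().strip()`: each call stops at the first character it does not target, so a single pass can leave residue when control codes and typographical spaces are interleaved at the ends of a string. Here is a minimal sketch of one workaround, repeating both calls until nothing changes. The helper name `trimAndStrip` is hypothetical, not part of the JDK.

```java
final class TrimAndStrip {
    // Apply trim and strip repeatedly until the string stops shrinking.
    // Both methods only remove characters from the ends, so an unchanged
    // length means an unchanged string.
    static String trimAndStrip(String input) {
        String current = input;
        while (true) {
            String next = current.trim().strip();
            if (next.length() == current.length()) {
                return next;
            }
            current = next;
        }
    }

    public static void main(String[] args) {
        // EM SPACE (U+2003) on the outside, START OF HEADING (U+0001) inside it.
        String messy = "\u2003\u0001abc\u0001\u2003";
        System.out.println(messy.trim().strip().length()); // 5: the control codes remain
        System.out.println(trimAndStrip(messy).length());  // 3
    }
}
```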
Code
Here is the code to generate that table.
The strategy is to brute-force check every character in Unicode to see if it is affected by either trim or strip.
We loop through all Unicode code points. Then we filter for code points that are assigned a character. (Most Unicode code points are reserved for future or private use, so not assigned a character.)
Then the acid test: We create a string with a single character, the character assigned to that code point by Unicode. We call both trim & strip on that string to see if the string becomes empty. If empty, then we know the command targets that character. We collect each of those targeted code points.

    import java.util.ArrayList;
    import java.util.List;

    // The statements below go inside a method, such as main.
    System.out.println ( Runtime.version ( ) );
    List < Integer > trimCodePoints = new ArrayList <> ( );
    List < Integer > stripCodePoints = new ArrayList <> ( );

    // Loop every code point in the Unicode range.
    for ( int codePoint = Character.MIN_CODE_POINT ; codePoint <= Character.MAX_CODE_POINT ; codePoint++ )
    {
        if ( Character.isDefined ( codePoint ) )  // Filter for valid code points.
        {
            String input = Character.toString ( codePoint );
            if ( input.trim ( ).isEmpty ( ) ) { trimCodePoints.add ( codePoint ); }
            if ( input.strip ( ).isEmpty ( ) ) { stripCodePoints.add ( codePoint ); }
        }
    }

Lastly, we build a Markdown table to report the results here in Stack Overflow.

    import java.util.SequencedSet;  // Requires Java 21+.
    import java.util.TreeSet;

    // Build a table in Markdown.
    // A TreeSet merges both lists, dropping duplicates and sorting by code point.
    SequencedSet < Integer > codePointsDistinct = new TreeSet <> ( trimCodePoints );
    codePointsDistinct.addAll ( stripCodePoints );
    StringBuilder sb = new StringBuilder ( );
    sb.append ( "| Code Point Hex | Code Point Dec | Name | `trim` | `strip` |" ).append ( "\n" );
    sb.append ( "| :---: | ---: | :--- | :---: | :---: |" ).append ( "\n" );
    codePointsDistinct.forEach ( codePoint -> {
        sb
                .append ( " | " ).append ( String.format ( "U+%04X" , codePoint ) )
                .append ( " | " ).append ( String.format ( "%, d" , codePoint ) )
                .append ( " | " ).append ( Character.getName ( codePoint ) )
                .append ( " | " ).append ( trimCodePoints.contains ( codePoint ) ? "✅" : "❌" )
                .append ( " | " ).append ( stripCodePoints.contains ( codePoint ) ? "✅" : "❌" )
                .append ( " |" ).append ( "\n" );
    } );
    System.out.println ( "sb = \n" + sb );