Difference between String trim() and strip()

strip() looks very similar to trim()

Nope, the two methods are very different, though they do overlap. See the table below. They target different sets of characters, about two to three dozen each, with only ten characters in common.

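trim targets control codes: it removes every character at or below U+0020 (SPACE).

strip targets line endings and typographical spaces: it removes every character that Character.isWhitespace reports as whitespace.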
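
To make the difference concrete, here is a minimal sketch that pads the text "abc" with one character from each camp and checks what each method leaves behind. The class name and the `pad` helper are hypothetical, introduced only for this illustration.

```java
public class TrimVsStripDemo {

    // Hypothetical helper: surrounds "abc" with the given code point on both sides.
    static String pad(int codePoint) {
        String edge = Character.toString(codePoint); // Character.toString(int) needs Java 11+
        return edge + "abc" + edge;
    }

    public static void main(String[] args) {
        String control = pad(0x0001); // START OF HEADING, a control code
        String emSpace = pad(0x2003); // EM SPACE, a typographical space

        // trim drops the control code, strip leaves it in place.
        System.out.println(control.trim().length());  // 3
        System.out.println(control.strip().length()); // 5

        // strip drops the EM SPACE, trim leaves it in place.
        System.out.println(emSpace.trim().length());  // 5
        System.out.println(emSpace.strip().length()); // 3

        // Chaining both clears a control code wrapped around an EM SPACE.
        String mixed = "\u0001\u2003abc\u2003\u0001";
        System.out.println(mixed.trim().strip());     // abc
    }
}
```

One caveat: each method stops at the first character it does not target. So a control code sitting inside an EM SPACE, as in "\u2003\u0001abc", survives a single trim().strip() pass.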

You might want to use both.

    String result = input.trim().strip() ;

Table

Here is a table showing the characters removed by each method: trim (Java 1.0+) and strip (Java 11+).

This table illustrates the definition of whitespace as seen in Java 24. I doubt this has changed from Java 11 through Java 25 (I’ve not checked), but keep in mind that Unicode evolves so the definition of whitespace could change.

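| Code Point Hex | Code Point Dec | Name | `trim` | `strip` |
| :---: | ---: | :--- | :---: | :---: |
| U+0000 | 0 | NULL | ✅ | ❌ |
| U+0001 | 1 | START OF HEADING | ✅ | ❌ |
| U+0002 | 2 | START OF TEXT | ✅ | ❌ |
| U+0003 | 3 | END OF TEXT | ✅ | ❌ |
| U+0004 | 4 | END OF TRANSMISSION | ✅ | ❌ |
| U+0005 | 5 | ENQUIRY | ✅ | ❌ |
| U+0006 | 6 | ACKNOWLEDGE | ✅ | ❌ |
| U+0007 | 7 | BEL | ✅ | ❌ |
| U+0008 | 8 | BACKSPACE | ✅ | ❌ |
| U+0009 | 9 | CHARACTER TABULATION | ✅ | ✅ |
| U+000A | 10 | LINE FEED (LF) | ✅ | ✅ |
| U+000B | 11 | LINE TABULATION | ✅ | ✅ |
| U+000C | 12 | FORM FEED (FF) | ✅ | ✅ |
| U+000D | 13 | CARRIAGE RETURN (CR) | ✅ | ✅ |
| U+000E | 14 | SHIFT OUT | ✅ | ❌ |
| U+000F | 15 | SHIFT IN | ✅ | ❌ |
| U+0010 | 16 | DATA LINK ESCAPE | ✅ | ❌ |
| U+0011 | 17 | DEVICE CONTROL ONE | ✅ | ❌ |
| U+0012 | 18 | DEVICE CONTROL TWO | ✅ | ❌ |
| U+0013 | 19 | DEVICE CONTROL THREE | ✅ | ❌ |
| U+0014 | 20 | DEVICE CONTROL FOUR | ✅ | ❌ |
| U+0015 | 21 | NEGATIVE ACKNOWLEDGE | ✅ | ❌ |
| U+0016 | 22 | SYNCHRONOUS IDLE | ✅ | ❌ |
| U+0017 | 23 | END OF TRANSMISSION BLOCK | ✅ | ❌ |
| U+0018 | 24 | CANCEL | ✅ | ❌ |
| U+0019 | 25 | END OF MEDIUM | ✅ | ❌ |
| U+001A | 26 | SUBSTITUTE | ✅ | ❌ |
| U+001B | 27 | ESCAPE | ✅ | ❌ |
| U+001C | 28 | INFORMATION SEPARATOR FOUR (File Separator) | ✅ | ✅ |
| U+001D | 29 | INFORMATION SEPARATOR THREE (Group Separator) | ✅ | ✅ |
| U+001E | 30 | INFORMATION SEPARATOR TWO (Record Separator) | ✅ | ✅ |
| U+001F | 31 | INFORMATION SEPARATOR ONE (Unit Separator) | ✅ | ✅ |
| U+0020 | 32 | SPACE | ✅ | ✅ |
| U+1680 | 5,760 | OGHAM SPACE MARK | ❌ | ✅ |
| U+2000 | 8,192 | EN QUAD | ❌ | ✅ |
| U+2001 | 8,193 | EM QUAD | ❌ | ✅ |
| U+2002 | 8,194 | EN SPACE | ❌ | ✅ |
| U+2003 | 8,195 | EM SPACE | ❌ | ✅ |
| U+2004 | 8,196 | THREE-PER-EM SPACE | ❌ | ✅ |
| U+2005 | 8,197 | FOUR-PER-EM SPACE | ❌ | ✅ |
| U+2006 | 8,198 | SIX-PER-EM SPACE | ❌ | ✅ |
| U+2008 | 8,200 | PUNCTUATION SPACE | ❌ | ✅ |
| U+2009 | 8,201 | THIN SPACE | ❌ | ✅ |
| U+200A | 8,202 | HAIR SPACE | ❌ | ✅ |
| U+2028 | 8,232 | LINE SEPARATOR | ❌ | ✅ |
| U+2029 | 8,233 | PARAGRAPH SEPARATOR | ❌ | ✅ |
| U+205F | 8,287 | MEDIUM MATHEMATICAL SPACE | ❌ | ✅ |
| U+3000 | 12,288 | IDEOGRAPHIC SPACE | ❌ | ✅ |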

Be aware that results may vary in future versions of Java. Unicode evolves.
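
The table also lines up with how the two methods are documented: trim removes code points up to and including U+0020, while strip defers to Character.isWhitespace. As a rough cross-check, the sketch below counts code points under those two rules rather than calling trim and strip. The class and method names are hypothetical, and the two predicates are my reading of the Javadoc rather than anything taken from the JDK source. If that reading holds, the counts should agree with the table on the same Java version: 33 for trim, 25 for strip, 10 shared.

```java
public class WhitespaceRules {

    // Assumed rule for trim: any code point at or below U+0020 (SPACE).
    static boolean trimTargets(int codePoint) {
        return codePoint <= 0x0020;
    }

    // Assumed rule for strip: whatever Character.isWhitespace accepts.
    static boolean stripTargets(int codePoint) {
        return Character.isWhitespace(codePoint);
    }

    // Returns { trim count, strip count, count targeted by both }.
    static int[] counts() {
        int trimCount = 0;
        int stripCount = 0;
        int bothCount = 0;
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            boolean t = trimTargets(cp);
            boolean s = stripTargets(cp);
            if (t) { trimCount++; }
            if (s) { stripCount++; }
            if (t && s) { bothCount++; }
        }
        return new int[] { trimCount, stripCount, bothCount };
    }

    public static void main(String[] args) {
        int[] c = counts();
        System.out.println("trim = " + c[0] + ", strip = " + c[1] + ", both = " + c[2]);
    }
}
```

Because the strip count depends on the Unicode data shipped with the runtime, it is the number most likely to drift in a future release.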

Java 15 added String#stripIndent, not considered here in this Answer. By the way, Java 15 also added the ability to un-escape text: String#translateEscapes.

Code

Here is the code to generate that table.

The strategy is to brute-force check every character in Unicode to see if it is affected by either trim or strip.

We loop through all Unicode code points. Then we filter for code points that are assigned a character, using Character.isDefined. (Most Unicode code points are reserved for future or private use, so they are not assigned a character.)

Then the acid test: We create a string with a single character, the character assigned to that code point by Unicode. We call both trim & strip on that string to see if the string goes empty. If empty, then we know the method targets that character. We collect each of those targeted code points.
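
Before running the full loop, the same single-character test can be tried on a handful of code points. This is only a sketch: the class name, the two helper methods, and the choice of samples are mine. The samples include two non-breaking spaces, NO-BREAK SPACE (U+00A0) and FIGURE SPACE (U+2007), which neither method removes.

```java
public class AcidTest {

    // True if trim reduces a one-character string of this code point to empty.
    static boolean trimRemoves(int codePoint) {
        return Character.toString(codePoint).trim().isEmpty();
    }

    // True if strip reduces a one-character string of this code point to empty.
    static boolean stripRemoves(int codePoint) {
        return Character.toString(codePoint).strip().isEmpty();
    }

    public static void main(String[] args) {
        // NULL, SPACE, NO-BREAK SPACE, FIGURE SPACE, IDEOGRAPHIC SPACE
        int[] samples = { 0x0000, 0x0020, 0x00A0, 0x2007, 0x3000 };
        for (int cp : samples) {
            System.out.printf("U+%04X %-18s trim=%b strip=%b%n",
                    cp, Character.getName(cp), trimRemoves(cp), stripRemoves(cp));
        }
    }
}
```

The non-breaking spaces are a reminder that a character can look like whitespace on screen and still get past both trim and strip.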

    // Requires imports: java.util.ArrayList, java.util.List
    System.out.println ( Runtime.version ( ) );
    List < Integer > trimCodePoints = new ArrayList <> ( );
    List < Integer > stripCodePoints = new ArrayList <> ( );
    // Loop every code point in the Unicode range.
    for ( int codePoint = Character.MIN_CODE_POINT ; codePoint <= Character.MAX_CODE_POINT ; codePoint++ )
    {
        if ( Character.isDefined ( codePoint ) )  // Filter for valid code points.
        {
            // A one-character string goes empty only if the method targets that character.
            String input = Character.toString ( codePoint );
            if ( input.trim ( ).isEmpty ( ) ) { trimCodePoints.add ( codePoint ); }
            if ( input.strip ( ).isEmpty ( ) ) { stripCodePoints.add ( codePoint ); }
        }
    }

Lastly, we build a Markdown table to report the results here in Stack Overflow.

    // Build a table in Markdown.
    // Requires imports: java.util.SequencedSet (Java 21+), java.util.TreeSet
    // The TreeSet merges both lists into one sorted set of distinct code points.
    SequencedSet < Integer > codePointsDistinct = new TreeSet <> ( trimCodePoints );
    codePointsDistinct.addAll ( stripCodePoints );
    StringBuilder sb = new StringBuilder ( );
    sb.append ( "| Code Point Hex | Code Point Dec | Name | `trim` | `strip` |" ).append ( "\n" );
    sb.append ( "| :---: | ---: | :--- | :---: | :---: |" ).append ( "\n" );
    codePointsDistinct.forEach ( codePoint ->
    {
        sb
                .append ( " | " ).append ( String.format ( "U+%04X" , codePoint ) )
                .append ( " | " ).append ( String.format ( "%, d" , codePoint ) )
                .append ( " | " ).append ( Character.getName ( codePoint ) )
                .append ( " | " ).append ( trimCodePoints.contains ( codePoint ) ? "✅" : "❌" )
                .append ( " | " ).append ( stripCodePoints.contains ( codePoint ) ? "✅" : "❌" )
                .append ( " |" ).append ( "\n" );
    } );
    System.out.println ( "sb = \n" + sb );

2026-01-31 10:34