Home Blog Data Cleaning Techniques

Data Cleaning in Excel: 8 Techniques You Should Know

📅 February 10, 2026 ⏱️ 13 min read ✍️ Learn Make Easy Team 📊 Data Analysis
Start Free Excel Course → All Articles
📋 Table of Contents

    Bad data is one of the biggest hidden problems in any organisation. Duplicate records inflate your totals. Blank cells break your formulas. Dates stored as text refuse to sort. Numbers with spaces cannot be calculated. And inconsistent spellings — "Mumbai", "mumbai", "MUMBAI" — scatter your pivot tables into chaos.

    The good news: Excel has powerful built-in tools to fix all of it. In this article, we cover the 8 most important data cleaning techniques every Excel user should know — with step-by-step instructions, real formulas, and practical examples.

    Messy spreadsheet data being cleaned and organised
    Messy data costs organisations hours every week. These 8 techniques will save you most of that time.

    Quick Reference — All 8 Techniques

    Click any technique name to jump to its full explanation:

    #TechniqueWhat It FixesKey Tool / Formula
    1Remove DuplicatesDuplicate rows inflating counts and totalsData → Remove Duplicates
    2Handle Blank CellsEmpty cells breaking formulas and Pivot TablesGo To Special → Blanks
    3Fix Text FormattingInconsistent case, extra spaces, hidden charactersTRIM, CLEAN, PROPER, UPPER
    4Convert Text to NumbersNumbers stored as text — SUM returns 0VALUE(), Paste Special × 1
    5Fix Text DatesDates that won't sort, group, or calculateDATEVALUE(), Text to Columns
    6Split & Combine ColumnsFull name in one cell, address all in one columnText to Columns, TEXTJOIN, LEFT/MID/RIGHT
    7Find & Replace + SUBSTITUTEInconsistent values, wrong entries across thousands of rowsCtrl+H, SUBSTITUTE()
    8Validate & Flag Bad DataOut-of-range values, wrong formats, future data entry errorsData Validation, IFERROR, Conditional Formatting
    Technique 01

    🗑️ Remove Duplicate Rows

    The problem: Duplicate rows appear when data is copied from multiple sources, exported twice, or pasted more than once. They inflate totals, double-count customers, and make every report inaccurate.

    The fix — Built-in tool:

    • Select any cell in your data range
    • Data tab → Remove Duplicates
    • Choose which columns define a "duplicate" — check ALL for exact row match, or just Email/ID for logical duplicates
    • Click OK — Excel tells you how many duplicates were removed and how many unique values remain

    The fix — COUNTIF to find duplicates first (safer approach):

    =COUNTIF($A$2:$A$100, A2) ' Count how many times this value appears =IF(COUNTIF($A$2:A2, A2)>1, "Duplicate", "Unique") ' Flag from 2nd occurrence onward

    Filter by "Duplicate" to review before deleting. This is safer than Remove Duplicates because you can verify each one first.

    Real-world example: Customer list collected from 3 events — 2,400 total entries. After Remove Duplicates on the Email column → 1,850 unique customers. Every report was overcounting by 550 people.
    💡 Pro Tip: Always work on a COPY of your data before removing duplicates. Excel keeps the FIRST occurrence and deletes subsequent ones. If the first record has wrong data and a later duplicate is correct, you will lose the correct version.
    Excel spreadsheet with highlighted duplicate data rows
    Duplicates are invisible to the eye but devastating to your reports. Always check before final analysis.
    Technique 02

    ⬜ Handle Blank Cells

    The problem: Blank cells in data cause formulas to error, Pivot Tables to show "(blank)" as a category, AVERAGE to skip rows, and sorting to push empty rows to the bottom (or top).

    Method 1 — Find all blanks at once:

    • Select your data range (e.g., A1:F500)
    • Press Ctrl+G (Go To) → click Special → select Blanks → OK
    • All blank cells are now selected (highlighted in blue)
    • Type your fill value (e.g., 0, "N/A", "Unknown") → press Ctrl+Enter to fill ALL selected blanks at once

    Method 2 — Formula to handle blanks:

    =IF(A2="", "Not Provided", A2) ' Replace blank with text =IF(A2="", 0, A2) ' Replace blank with zero =IFERROR(A2/B2, 0) ' Handle divide-by-blank errors =COUNTA(A2:A100) ' Count non-blank cells only

    Method 3 — Delete entire rows with blanks in key column:

    • Filter the key column (e.g., Customer ID) to show blanks only
    • Select all visible rows → right-click → Delete Row
    • Clear the filter → blank rows are gone
    Real-world example: Attendance report with 200 rows. 18 rows have blank in the "Status" column. Use Ctrl+G → Special → Blanks → type "Absent" → Ctrl+Enter. All 18 blanks filled in 5 seconds instead of scrolling through 200 rows manually.
    💡 Pro Tip: Before any analysis, run =COUNTBLANK(A2:A1000) on each column. Any column with blanks that should not have them is a data quality problem. Fix blanks before building Pivot Tables or reports.
    Technique 03

    🧹 Fix Text Formatting — TRIM, CLEAN, PROPER

    The problem: Text data imported from other systems often has: leading/trailing spaces (" Delhi "), extra spaces between words ("New Delhi"), non-printable hidden characters (from copy-paste or system exports), and inconsistent capitalisation ("MUMBAI", "mumbai", "Mumbai").

    The core formula — always start with this:

    =TRIM(CLEAN(A2)) ' CLEAN removes non-printable characters ' TRIM removes extra spaces (leading, trailing, between words) ' Together: the ultimate text cleaner

    Fix capitalisation:

    =PROPER(TRIM(CLEAN(A2))) ' Capitalises First Letter Of Each Word → "New Delhi" =UPPER(TRIM(CLEAN(A2))) ' ALL CAPS → "NEW DELHI" =LOWER(TRIM(CLEAN(A2))) ' all lowercase → "new delhi"

    After writing formula — paste as values:

    • Copy the formula column → right-click original column → Paste Special → Values only
    • Now delete the formula column — you have clean data in the original column
    Real-world example: CRM export has customer names: " rahul sharma ", "PRIYA PATEL", "ankit verma". After =PROPER(TRIM(CLEAN(A2))): "Rahul Sharma", "Priya Patel", "Ankit Verma". Your VLOOKUP and Pivot Tables now work correctly because "Delhi" and " Delhi " are finally the same value.
    💡 Pro Tip: Use =LEN(A2) before and after TRIM to verify cleaning worked. If =LEN(" Delhi ") = 9 before and =LEN(TRIM(" Delhi ")) = 5 after, you removed 4 invisible space characters. This quick check proves your data is actually clean.
    Technique 04

    🔢 Convert Text to Numbers

    The problem: Numbers look like numbers but behave like text — they align left instead of right, SUM returns 0, AVERAGE skips them, and a green triangle appears in the corner. This happens when data is imported from CSV, copied from a web page, or exported from another system.

    How to identify text numbers:

    • Numbers aligned to the LEFT = stored as text (real numbers align right)
    • Green triangle in top-left corner of cell
    • =SUM(A1:A10) returns 0 even though values are visible
    • =ISNUMBER(A1) returns FALSE

    Fix Method 1 — VALUE() formula:

    =VALUE(A2) ' Converts text "50000" to number 50000 =VALUE(TRIM(A2)) ' Also removes spaces before converting =VALUE(SUBSTITUTE(A2,",","")) ' Remove commas first: "1,000" → 1000

    Fix Method 2 — Paste Special (fastest for many cells):

    • Type 1 in any empty cell → Copy it (Ctrl+C)
    • Select all your text-number cells
    • Paste Special (Ctrl+Alt+V) → Values → Multiply → OK
    • Multiplying by 1 converts text to numbers instantly — no formula needed

    Fix Method 3 — Green triangle shortcut:

    • Select cells with green triangles
    • Click the yellow warning diamond that appears → Convert to Number
    Real-world example: Monthly sales export from ERP system. Amount column shows ₹45,000 but SUM = 0. Reason: amounts exported as text with ₹ symbol. Fix: =VALUE(SUBSTITUTE(SUBSTITUTE(A2,"₹",""),",","")) removes symbol and commas, converts to real number. Now SUM, AVERAGE, and Pivot Tables work correctly.
    💡 Pro Tip: =ISNUMBER(A2) is your diagnostic formula. Run it alongside any column you suspect has text numbers. If it returns FALSE for values that look like numbers, you have a text-number problem. Fix those cells before any analysis.
    Clean data analysis results in Excel dashboard
    Clean data leads to accurate reports. Every minute spent cleaning saves hours of re-working incorrect analysis.
    Technique 05

    📅 Fix Dates Stored as Text

    The problem: Dates imported from other systems are often stored as text — "01/04/2026" as a text string instead of a real Excel date. Text dates: do not sort correctly, cannot be grouped in Pivot Tables, cannot be used in date calculations (DATEDIF, NETWORKDAYS), and appear left-aligned instead of right-aligned.

    How to identify text dates:

    • Dates align to the LEFT = text. Real dates align RIGHT.
    • Grouping in Pivot Table is greyed out
    • =ISNUMBER(A2) returns FALSE for a date cell
    • Sorting gives wrong order (01/01/2026, 01/02/2026 sort as text, not chronologically)

    Fix Method 1 — DATEVALUE() formula:

    =DATEVALUE(A2) ' Converts text date to Excel serial number ' Then format the cell as Date (Ctrl+1) =DATEVALUE(TRIM(A2)) ' Also handles spaces in the text date

    Fix Method 2 — Text to Columns (fastest for whole column):

    • Select the date column
    • Data tab → Text to Columns → Delimited → Next → Next
    • Step 3: select "Date" and choose the format (DMY, MDY, YMD)
    • Click Finish — entire column converts to real dates instantly

    Fix Method 3 — Formula for specific formats:

    ' Text date "15-Apr-2026" → real date =DATE(RIGHT(A2,4), MATCH(MID(A2,4,3), {"Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"}, 0), LEFT(A2,2)) ' Text "20260415" (YYYYMMDD format) → real date =DATE(LEFT(A2,4), MID(A2,5,2), RIGHT(A2,2))
    Real-world example: HR system exports joining dates as text "DD/MM/YYYY". None group in Pivot Table by month. Select the column → Text to Columns → Date → DMY → Finish. All 500 dates instantly become real Excel dates. Pivot Table grouping by Month now works perfectly.
    💡 Pro Tip: After converting text dates to real dates, format them immediately: select the column → Ctrl+1 → Date → choose your preferred format (DD-MMM-YYYY is most readable). Without formatting, converted dates show as a serial number like 46127 which confuses everyone.
    Technique 06

    ✂️ Split and Combine Columns

    The problem: Data often comes with information crammed into one cell — "Rahul Sharma" in one column when you need First and Last Name separately, or an address "123, MG Road, Mumbai, Maharashtra" when you need city and state as separate columns for analysis.

    Split: Text to Columns (fastest for delimiters):

    • Select the column to split
    • Data tab → Text to Columns → Delimited
    • Choose delimiter: Comma, Space, or Custom character
    • Preview shows how the split will look → Finish
    • Excel splits into adjacent columns automatically

    Split: Formula approach (more flexible):

    ' Split "Rahul Sharma" into First and Last Name ' First Name: =LEFT(A2, FIND(" ", A2) - 1) ' Everything before the space ' Last Name: =MID(A2, FIND(" ", A2) + 1, LEN(A2)) ' Everything after the space ' Extract city from "123, MG Road, Mumbai, Maharashtra" =TRIM(MID(A2, FIND("~~", SUBSTITUTE(A2, ",", "~~", 3)) + 1, LEN(A2)))

    Combine: Join columns back together:

    ' Combine First + Last Name with space =A2 & " " & B2 ' Combine with separator =TEXTJOIN(", ", TRUE, A2, B2, C2) ' "Delhi, India, 110001" ' Combine date into sentence ="Report for " & TEXT(A2, "DD-MMM-YYYY")
    Real-world example: Employee database has "Department - Location" in one column: "Sales - Mumbai", "HR - Delhi". Need them separate. Text to Columns → Delimited → custom delimiter " - " → splits into Department column and Location column instantly. Now you can filter, Pivot, and SUMIF each independently.
    💡 Pro Tip: In Excel 365, use TEXTSPLIT() for even more flexibility: =TEXTSPLIT(A2, " - ") splits "Sales - Mumbai" into two cells automatically. It even handles splitting into rows with a second delimiter argument.
    Technique 07

    🔁 Find & Replace + SUBSTITUTE

    The problem: Inconsistent values scattered across thousands of rows — "Mumbai", "Bombay", "mumbai", "MUMBAI" all meaning the same city. Wrong codes like "N/A", "NA", "#N/A" mixed with actual zeros. Old product names that have been renamed. Currency symbols mixed with numbers.

    Find & Replace (Ctrl+H) — for quick bulk replacement:

    • Press Ctrl+H to open Find & Replace dialog
    • Find What: type the wrong value (e.g., "Bombay")
    • Replace With: type the correct value (e.g., "Mumbai")
    • Options → Match Case if needed (to avoid changing "bombay" differently)
    • Replace All → done across the entire sheet in one click
    • Use "Match entire cell contents" to avoid partial replacements

    SUBSTITUTE() — formula approach for specific patterns:

    =SUBSTITUTE(A2, "Bombay", "Mumbai") ' Replace specific text =SUBSTITUTE(A2, " ", "") ' Remove ALL spaces =SUBSTITUTE(A2, "₹", "") ' Remove currency symbol =SUBSTITUTE(A2, ",", "", ) ' Remove all commas =SUBSTITUTE(A2, CHAR(10), " ") ' Replace line break with space ' Nested SUBSTITUTE — remove multiple characters: =SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A2, "(", ""), ")", ""), "-", "")

    Replace blank-looking cells that are not actually blank:

    ' Cells showing "N/A" or "NA" as text: Ctrl+H → Find: N/A → Replace: (leave empty) → Replace All ' Or with formula: =IF(OR(A2="N/A", A2="NA", A2="-"), "", A2)
    Real-world example: Sales database has Region column with: "North", "north", "NORTH", "Nth", "N". Ctrl+H: Replace "north" with "North" (match case off) catches all case variants. Then Replace "Nth" with "North" and "N" with "North" (match entire cell). Five minutes to standardise 3,000 rows — impossible to do manually.
    💡 Pro Tip: Use Find & Replace with wildcards. In the Find What box, check "Use wildcards" under Options. Now * matches any characters: "Mum*" finds Mumbai, Mumbra, Mumtaz — all starting with Mum. "?at" finds "cat", "bat", "hat" — one character before "at". Powerful for pattern-based cleaning.
    Technique 08

    ✅ Validate & Flag Bad Data

    The problem: Cleaning existing data is reactive — you fix problems after they occur. Validation is proactive — it prevents bad data from entering in the first place. Flagging identifies bad data that already slipped through so you can deal with it before it corrupts your reports.

    Data Validation — prevent bad entries:

    • Select the cells where you want rules
    • Data tab → Data Validation → Settings tab
    • Allow: Whole Number, Decimal, List, Date, Text Length, or Custom formula
    • Input Message: hint shown when user clicks the cell
    • Error Alert: Stop (block entry), Warning (allow with warning), Information
    ' Custom validation formula examples: =AND(A2>=0, A2<=100) ' Only values between 0 and 100 =ISNUMBER(A2) ' Only numbers allowed =LEN(A2)=10 ' Exactly 10 characters (phone numbers) =COUNTIF($B$2:B2, B2)=1 ' No duplicates in column B =NOT(ISBLANK(A2)) ' Cell cannot be left blank

    Flag existing bad data with formulas:

    ' Flag negative values that should not be negative =IF(A2<0, "⚠️ Negative — Check", "OK") ' Flag dates in the future (data entry error) =IF(A2>TODAY(), "⚠️ Future Date", "OK") ' Flag values outside expected range =IF(OR(A2<1000, A2>100000), "⚠️ Check Amount", "OK") ' Flag text where number expected =IF(NOT(ISNUMBER(A2)), "⚠️ Not a Number", "OK") ' Flag duplicates =IF(COUNTIF($A$2:A2, A2)>1, "⚠️ Duplicate", "OK")

    Conditional Formatting to highlight bad data visually:

    • Select the data range → Home → Conditional Formatting → New Rule
    • Use a formula: =A2="" highlights blank cells in red
    • =A2<0 highlights negative values in orange
    • =COUNTIF($A$2:$A$100,A2)>1 highlights all duplicates in yellow
    Real-world example: Expense report form with Amount column. Add Data Validation: Whole Number, between 1 and 500000. Error message: "Amount must be between ₹1 and ₹5,00,000". Now if someone types 5000000 or a negative number, Excel blocks them immediately — before the data is ever saved.
    ⚠️ Important: Data Validation only controls future data entry — it does NOT clean or flag existing data already in the sheet. For existing data, use the formula flagging approach or Conditional Formatting described above. Use both together for complete coverage.

    Bonus: Quick Data Cleaning Checklist

    Before any analysis or reporting, run through this checklist on your dataset:

    CheckHow to VerifyFix If Found
    No duplicate rowsData → Remove Duplicates (check count)Remove Duplicates or flag with COUNTIF
    No blank rowsCtrl+End — data should end at last real rowCtrl+G → Special → Blanks → Delete rows
    No blank cells in key columns=COUNTBLANK(A2:A1000)Fill with 0, "Unknown", or delete row
    Numbers are real numbers=ISNUMBER(A2), check alignmentVALUE(), Paste Special × 1
    Dates are real dates=ISNUMBER(A2), check alignmentDATEVALUE() or Text to Columns
    No extra spaces in text=LEN(A2) vs =LEN(TRIM(A2))=TRIM(CLEAN(A2))
    Consistent text valuesPivot Table on text column — look for duplicatesFind & Replace, SUBSTITUTE, PROPER
    No merged cells in dataSelect all → Format Cells → look for MergeUnmerge All Cells
    Column headers in Row 1Visual checkDelete blank rows above headers
    No total rows inside dataVisual check / sort by columnDelete subtotal rows from within data

    🎯 Summary — 8 Techniques Every Excel User Should Know

    1. Remove Duplicates — Data → Remove Duplicates or COUNTIF to flag first
    2. Handle Blank Cells — Ctrl+G → Special → Blanks → Ctrl+Enter to fill all
    3. Fix Text Formatting — =PROPER(TRIM(CLEAN(A2))) cleans almost everything
    4. Convert Text to Numbers — VALUE() formula or Paste Special × 1
    5. Fix Text Dates — Text to Columns → Date format or DATEVALUE()
    6. Split & Combine Columns — Text to Columns for splits, & and TEXTJOIN for combining
    7. Find & Replace + SUBSTITUTE — Ctrl+H for bulk, SUBSTITUTE() for formula-based
    8. Validate & Flag Bad Data — Data Validation to prevent, Conditional Formatting to highlight

    Frequently Asked Questions

    Why is data cleaning important in Excel?+
    Dirty data causes wrong totals, broken formulas, and incorrect Pivot Tables. Analysts spend 60-80% of time cleaning data — mastering these 8 techniques saves hours every week.
    How do I remove duplicates in Excel?+
    Data tab → Remove Duplicates → choose which columns define a duplicate → click OK. Use =COUNTIF($A$2:$A$100, A2) first to flag and review duplicates safely before deleting.
    How do I fix numbers stored as text in Excel?+
    Three methods: 1) =VALUE(A2) formula. 2) Type 1, copy it, Paste Special → Multiply on text-number cells. 3) Click the yellow green-triangle warning → Convert to Number.
    What is the TRIM function in Excel?+
    TRIM removes leading, trailing, and extra spaces from text. Use =TRIM(CLEAN(A2)) together — removes extra spaces AND hidden non-printable characters.
    How do I convert text dates to real Excel dates?+
    Select column → Data → Text to Columns → Delimited → Next → Next → Date (DMY) → Finish. Or use =DATEVALUE(A2). Real dates align right in the cell; text dates align left.
    📚 Related Resources:  Excel for Beginners  |  Advanced Excel Formulas  |  Pivot Tables vs Power Query  |  Top 10 Excel Formulas

    Want to master Excel from scratch?

    Our free Excel for Beginners course covers all the fundamentals — formulas, formatting, tables, and more — in 30 practical step-by-step lessons.