Commit de7af21

Clear where non-American English appears
Previously, the hook warned when a commit message contained non-American English words, but it did not report which words needed correction, and the warning line number was always set to 1. This commit reports the actual line number and the word that triggered each warning, so users can quickly identify what to fix.

Implementation Details:

The function 'get_match_position' finds the first occurrence of a target string in a multi-line text. It takes the text, the target string, a starting line, and a starting column as input, and prints the line and column where the target is found.

The function 'get_all_match_positions' iterates through multiple target words, using 'get_match_position' to locate each word’s first occurrence at or after the cursor. It maintains a cursor ('start_line' and 'start_col') so that the search for each subsequent word continues from the last match position.

The output format is 'target: line,column', which makes it easier to pinpoint problematic words.

Change-Id: If6cbb943ac5ee5b450486686f32ad55ed1d4f234
1 parent bb50402 commit de7af21
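
The cursor-based search described in the commit message can be tried in isolation. The following is a minimal sketch, not the hook's code: the names `find_from` and `locate_all` and the sample message are hypothetical, and it uses POSIX pipes where the hook uses bash here-strings. It shows that a word appearing twice is reported at two different positions because each search resumes just past the previous match.

```shell
#!/bin/sh
# Sketch: locate each word at or after a moving (line, column) cursor.

# Print "line col" of the first occurrence of $2 in text $1, searching from
# line $3, column $4 (both 1-based, default 1). Prints nothing if not found.
find_from() {
  printf '%s\n' "$1" | awk -v t="$2" -v sl="${3:-1}" -v sc="${4:-1}" '
    NR < sl { next }
    {
      off = (NR == sl) ? sc : 1   # the column offset only applies on the start line
      pos = index(substr($0, off), t)
      if (pos) { print NR, pos + off - 1; exit }
    }'
}

# Print "word: line,col" for each word in $2 (one per line), in order.
locate_all() {
  cur_line=1
  cur_col=1
  printf '%s\n' "$2" | while IFS= read -r word; do
    [ -z "$word" ] && continue
    hit=$(find_from "$1" "$word" "$cur_line" "$cur_col")
    [ -z "$hit" ] && continue
    cur_line=${hit% *}
    cur_col=${hit#* }
    echo "$word: $cur_line,$cur_col"
    cur_col=$((cur_col + 1))   # the next search starts just past this match
  done
}

# Hypothetical sample message: "colour" appears twice, "behaviour" once.
text=$(printf '%s\n' 'Fix colour table' '' 'The colour and behaviour were wrong')
words=$(printf '%s\n' colour colour behaviour)

locate_all "$text" "$words"
```

With this sample input the sketch prints `colour: 1,5`, `colour: 3,5`, and `behaviour: 3,16`. Note that `index` matches substrings, so a target is also found inside a longer word.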

1 file changed: +64 -1 lines changed

scripts/commit-msg.hook

Lines changed: 64 additions & 1 deletion
@@ -124,6 +124,54 @@ read_commit_message() {
   done < $COMMIT_MSG_FILE
 }

+### Get the position (line and column) of the first occurrence of a target string.
+# Usage: get_match_position "$text" "$target" [start_line] [start_col]
+# Parameters:
+#   text       - multiline text to search in.
+#   target     - string to search for.
+#   start_line - (optional) starting line number (default: 1).
+#   start_col  - (optional) starting column number (default: 1).
+get_match_position() {
+  local text="$1" target="$2"
+  local start_line="${3:-1}" start_col="${4:-1}"
+  awk -v t="$target" -v sl="$start_line" -v sc="$start_col" '{
+    if (NR < sl) next;
+    if (NR == sl) {
+      pos = index(substr($0, sc), t);
+      if (pos) { print NR, pos+sc-1; exit }
+    } else {
+      pos = index($0, t);
+      if (pos) { print NR, pos; exit }
+    }
+  }' <<< "$text"
+}
+
+### For each target (one per line) in a multiline string,
+### search the text and print the target's first match as "target: line,col".
+# Parameters:
+#   $1 - multiline text to search in.
+#   $2 - targets (each line is a target string).
+get_all_match_positions() {
+  local text="$1"
+  local targets="$2"
+  local start_line=1
+  local start_col=1
+  local target result line col
+
+  while IFS= read -r target; do
+    [ -z "$target" ] && continue
+    result=$(get_match_position "$text" "$target" "$start_line" "$start_col")
+
+    [ -z "$result" ] && continue
+    read line col <<< "$result"
+    echo "$target: $line,$col"
+
+    # Advance the search cursor for subsequent targets.
+    start_line="$line"
+    start_col=$((col + 1))
+  done <<< "$targets"
+}
+
 #
 # Validate the contents of the commmit msg agains the good commit guidelines.
 #
@@ -350,6 +398,8 @@ done

   FULL_COMMIT_MSG=$(sed '/^#/d;/^[[:space:]]*$/d;/^[[:space:]]*Change-Id:/d' "$COMMIT_MSG_FILE" | \
     sed -E "s@${URL_REGEX#^}@@g")
+  FULL_COMMIT_MSG_WITH_EMPTY=$(sed '/^#/d;/^[[:space:]]*Change-Id:/d' "$COMMIT_MSG_FILE" | \
+    sed -E "s@${URL_REGEX#^}@@g")
   # Extended list of abusive words (case-insensitive).
   # Adjust the list as needed.
   ABUSIVE_WORDS_REGEX='\b(fuck|fucking|dick|shit|bitch|asshole|cunt|motherfucker|damn|crap|dumbass|piss)\b'
@@ -372,11 +422,24 @@ done
   MSG_FOR_SPELLCHECK=$(echo "$FULL_COMMIT_MSG" | sed -E \
     -e "s/(['\"][^'\"]*['\"])//g" \
     -e "s/\bcommit[[:space:]]+[0-9a-fA-F]{7,40}\b/commit/g")
+  MSG_SPELLCHECK_FOR_LINE_FINDING=$(echo "$FULL_COMMIT_MSG_WITH_EMPTY" | sed -E \
+    -e "s/(['\"][^'\"]*['\"])//g" \
+    -e "s/\bcommit[[:space:]]+[0-9a-fA-F]{7,40}\b/commit/g")

   # Use aspell to list misspelled words according to American English, ignoring quoted text.
   MISSPELLED_WORDS=$(echo "$MSG_FOR_SPELLCHECK" | $ASPELL --lang=en --list --home-dir=scripts --personal=aspell-pws)
   if [ -n "$MISSPELLED_WORDS" ]; then
-    add_warning 1 "Avoid using non-American English words"
+    results=$(get_all_match_positions "$MSG_SPELLCHECK_FOR_LINE_FINDING" "$MISSPELLED_WORDS")
+
+    while IFS= read -r result; do
+      # Expected format: "target: line,column"
+      local target=$(echo "$result" | cut -d: -f1)
+      local pos=$(echo "$result" | cut -d: -f2 | tr -d ' ')
+      local line=$(echo "$pos" | cut -d, -f1)
+
+      add_warning "$line" "Avoid using non-American English words: $target"
+    done <<< "$results"
+
   fi
 }

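The final hunk splits each 'target: line,column' record with `cut` and `tr` pipelines. As an alternative sketch only, the same fields can be extracted with shell parameter expansion, which avoids spawning extra processes per record. The function name `report_positions` is hypothetical, and `add_warning` below is a stand-in because the hook's real definition is not part of this diff. The sketch also skips empty records: a loop that reads an empty result list still sees one empty line.

```shell
#!/bin/sh
# Sketch: split "target: line,column" records without spawning cut/tr.

# Hypothetical stand-in for the hook's add_warning; the real one is not shown here.
add_warning() {
  echo "line $1: $2"
}

# Emit one warning per non-empty record in $1 (one record per line).
report_positions() {
  printf '%s\n' "$1" | while IFS= read -r record; do
    [ -z "$record" ] && continue   # an empty result list still yields one empty line
    target=${record%%: *}          # text before the first ": "
    pos=${record#*: }              # "line,column"
    line=${pos%%,*}                # keep only the line number
    add_warning "$line" "Avoid using non-American English words: $target"
  done
}

report_positions "$(printf '%s\n' 'colour: 1,5' 'behaviour: 3,16')"
```

For the two sample records this prints one warning for line 1 (`colour`) and one for line 3 (`behaviour`); the column is parsed but, as in the commit, only the line number is passed on.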