30. Substring with Concatenation of All Words
题目
You are given a string s
and an array of strings words
. All the strings of words
are of the same length.
A concatenated string is a string that exactly contains all the strings of any permutation of words
concatenated.
- For example, if
words = ["ab","cd","ef"]
, then"abcdef"
,"abefcd"
,"cdabef"
,"cdefab"
,"efabcd"
, and"efcdab"
are all concatenated strings."acdbef"
is not a concatenated string because it is not the concatenation of any permutation ofwords
.
Return an array of the starting indices of all the concatenated substrings in s
. You can return the answer in any order.
Example 1:
Input: s = "barfoothefoobarman", words = ["foo","bar"]
Output: [0,9]
Explanation:
The substring starting at 0 is "barfoo"
. It is the concatenation of ["bar","foo"]
which is a permutation of words
.
The substring starting at 9 is "foobar"
. It is the concatenation of ["foo","bar"]
which is a permutation of words
.
Example 2:
Input: s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]
Output: []
Explanation:
There is no concatenated substring.
Example 3:
Input: s = "barfoofoobarthefoobarman", words = ["bar","foo","the"]
Output: [6,9,12]
Explanation:
The substring starting at 6 is "foobarthe"
. It is the concatenation of ["foo","bar","the"]
.
The substring starting at 9 is "barthefoo"
. It is the concatenation of ["bar","the","foo"]
.
The substring starting at 12 is "thefoobar"
. It is the concatenation of ["the","foo","bar"]
.
Constraints:
1 <= s.length <= 104
1 <= words.length <= 5000
1 <= words[i].length <= 30
s
andwords[i]
consist of lowercase English letters.
思路
字符串匹配可以采用滑动窗口的思路,本题是词组异位,可以先定向匹配总词组串,在总字符串相匹配的条件下,再匹配子字符串 当总字符+子串匹配+ 子串 Counter 匹配,便可以确定是否出现唯一的解
解法
from collections import defaultdict, Counter
from typing import List
class Solution:
def findSubstring(self, s: str, words: List[str]) -> List[int]:
word_dict = defaultdict(int)
word_size = len(words[0])
flatten_words = "".join(words)
window_size = len(flatten_words)
words_cnt = Counter(flatten_words)
for word in words:
word_dict[word] += 1
if window_size > len(s):
return []
left = 0
right = 0
res = []
while right < len(s):
if s[right] not in words_cnt:
right += 1
left = right
elif right - left + 1 == window_size:
temp = defaultdict(int)
for i in range(left, right + 1, word_size):
min_window = s[i:i+word_size]
if min_window in word_dict:
temp[min_window] += 1
else:
# 此处跳过的步长为1,而非 word_size 因为有可能重复的字符
left += 1
right += 1
break
else:
if temp == word_dict:
res.append(left)
left += 1
right += 1
else:
right += 1
return res
复杂度分析
- 时间复杂度 O(n*m)
- m = words 的长度
- k = words 中子串的长度
- n = s 字符串的长度
- m + n 到 m + n * m
- 空间复杂度 O(k*m)