Skip to content

30. Substring with Concatenation of All Words

题目

You are given a string s and an array of strings words. All the strings of words are of the same length.

A concatenated string is a string that exactly contains all the strings of any permutation of words concatenated.

  • For example, if words = ["ab","cd","ef"], then "abcdef", "abefcd", "cdabef", "cdefab", "efabcd", and "efcdab" are all concatenated strings. "acdbef" is not a concatenated string because it is not the concatenation of any permutation of words.

Return an array of the starting indices of all the concatenated substrings in s. You can return the answer in any order.

 

Example 1:

Input: s = "barfoothefoobarman", words = ["foo","bar"]

Output: [0,9]

Explanation:

The substring starting at 0 is "barfoo". It is the concatenation of ["bar","foo"] which is a permutation of words.
The substring starting at 9 is "foobar". It is the concatenation of ["foo","bar"] which is a permutation of words.

Example 2:

Input: s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]

Output: []

Explanation:

There is no concatenated substring.

Example 3:

Input: s = "barfoofoobarthefoobarman", words = ["bar","foo","the"]

Output: [6,9,12]

Explanation:

The substring starting at 6 is "foobarthe". It is the concatenation of ["foo","bar","the"].
The substring starting at 9 is "barthefoo". It is the concatenation of ["bar","the","foo"].
The substring starting at 12 is "thefoobar". It is the concatenation of ["the","foo","bar"].

 

Constraints:

  • 1 <= s.length <= 104
  • 1 <= words.length <= 5000
  • 1 <= words[i].length <= 30
  • s and words[i] consist of lowercase English letters.
Related Topics
  • 哈希表
  • 字符串
  • 滑动窗口

  • 👍 1233
  • 👎 0
  • 思路

    字符串匹配可以采用滑动窗口的思路,本题是词组异位,可以先定向匹配总词组串,在总字符串相匹配的条件下,再匹配子字符串 当总字符+子串匹配+ 子串 Counter 匹配,便可以确定是否出现唯一的解

    解法

    python
    
    from collections import defaultdict, Counter
    from typing import List
    
    class Solution:
        def findSubstring(self, s: str, words: List[str]) -> List[int]:
            word_dict = defaultdict(int)
    
            word_size = len(words[0])
            flatten_words = "".join(words)
            window_size = len(flatten_words)
            words_cnt =  Counter(flatten_words)
            for word in words:
                word_dict[word] += 1
    
            if window_size > len(s):
                return []
    
            left = 0
            right = 0
            res = []
            while right < len(s):
    
                if s[right] not in words_cnt:
                    right += 1
                    left = right
                elif right - left + 1 == window_size:
                    temp = defaultdict(int)
                    for i in range(left, right + 1, word_size):
                        min_window = s[i:i+word_size]
                        if min_window in word_dict:
                            temp[min_window] += 1
                        else:
                            # 此处跳过的步长为1,而非 word_size 因为有可能重复的字符
                            left += 1
                            right += 1
                            break
                    else:
                        if temp == word_dict:
                            res.append(left)
                        left += 1
                        right += 1
                else:
                    right += 1
    
    
            return res

    复杂度分析

    • 时间复杂度 O(n*m)
      • m = words 的长度
      • k = words 中子串的长度
      • n = s 字符串的长度
      • m + n 到 m + n * m
    • 空间复杂度 O(k*m)