1178. Number of Valid Words for Each Puzzle

置顶

Created At : 2023-10-28 15:14

Count:1.2k Views 👀 :

With respect to a given puzzle string, a word is valid if both the following conditions are satisfied:

word contains the first letter of puzzle.
For each letter in word, that letter is in puzzle.
For example, if the puzzle is “abcdefg”, then valid words are “faced”, “cabbage”, and “baggage”; while invalid words are “beefed” (doesn’t include “a”) and “based” (includes “s” which isn’t in the puzzle).

Return an array answer, where answer[i] is the number of words in the given word list words that are valid with respect to the puzzle puzzles[i].

Example :

Input:
words = ["aaaa","asas","able","ability","actt","actor","access"],
puzzles = ["aboveyz","abrodyz","abslute","absoryz","actresz","gaswxyz"]
Output: [1,1,3,2,4,0]
Explanation:
1 valid word for "aboveyz" : "aaaa"
1 valid word for "abrodyz" : "aaaa"
3 valid words for "abslute" : "aaaa", "asas", "able"
2 valid words for "absoryz" : "aaaa", "asas"
4 valid words for "actresz" : "aaaa", "asas", "actt", "access"
There're no valid words for "gaswxyz" cause none of the words in the list contains letter 'g'.

Constraints:

1 <= words.length <= 10^5
4 <= words[i].length <= 50
1 <= puzzles.length <= 10^4
puzzles[i].length == 7
words[i][j], puzzles[i][j] are English lowercase letters.
Each puzzles[i] doesn’t contain repeated characters.

这道题说对于一个 puzzle 字符串，当满足两个条件就表示某个 word 是合法的。第一个是当 word 包含 puzzle 的首字母，第二个是对于 word 中的所有字母，均在 puzzle 中出现（这里不考虑次数，只考虑字母种类）。现在给了一个单词数组，和一个谜语数组，问对于每个 puzzle，各有多少个单词是合法的。这道题的题目挺长的，但是好在给的例子有详细的解释，题意并不难理解。大家能想到的最简单暴力的破解方法就是对于每个 puzzle，都遍历一遍所有的单词，去验证给的两个条件是否成立。但这种方法根本对不住本题的 Hard 身价，会被 OJ 无情拒绝，所以必须想更高效巧妙的方法。先从两个条件入手，第一个需要包含 puzzle 的首字母，这个没啥说的，很好满足，主要是第二个条件，word 中所有字母都需要在 puzzle 中出现，暴力搜索中的遍历会很耗时。因为这里不考虑出现次数，所以不用建立次数映射，而只是考虑字母种类的话，就可以用位操作 Bit Manipulation 来做，因为只有 26 个字母，若用每一个位来表示对应的字母，1表示存在，0表示不存在，那么最多只需要大小为 2^26 - 1 的整型数就能表示所有的情况。这样每个单词都可以用一个 mask 来表示了，注意不同的单词可能会有相同的 mask，比如 apple 和 pale，但不影响做题，因为字母的次数和顺序都无关紧要。由于这里关心的是相同 mask 的个数，则可以用个 HashMap 来建立 mask 和其出现次数之间的映射，方便之后的查询。

接下里就要开始处理 puzzles 数组了，遍历每个 puzzle 字符串，同样需要统计其字母种类，即算出 mask。跟 puzzle 的 mask 相同的单词肯定都是满足题意的，但不仅仅是相同的，还有 mask 的严格子集也是满足题意的，比如 puzzle 是 apple 的话，那么 word 可以是 apple，也可以是 apple 的子集，比如 app，ap，al 等等，只要包含首字母就可以了。那么如何求子集的 mask 呢，由于子集是减少字母的，所以 mask 值一定比原来的小，所以可以每次减1，但由于只能将原来1的位置变为0，而不能把0变为1，所以还需要 ‘与’ 上原来的 mask。这样就可以使用一个 while 循环，若当前 sub（即为 mask）’与’ 上 first 还是 first，且 sub 在 maskMap 中存在，则将其的映射值累加到 cnt。若此时 sub 为0，则 break 掉循环，否则 sub 自减1并 ‘与’ 上原 mask 即可，参见代码如下：

class Solution {
public:
    vector<int> findNumOfValidWords(vector<string>& words, vector<string>& puzzles) {
        vector<int> res;
        unordered_map<int, int> maskMap;
        for (string word : words) {
            int mask = 0;
            for (char c : word) {
                mask |= 1 << (c - 'a');
            }
            ++maskMap[mask];
        }
        for (string puzzle : puzzles) {
            int mask = 0;
            for (char c : puzzle) {
                mask |= 1 << (c - 'a');
            }
            int sub = mask, cnt = 0, first = 1 << (puzzle[0] - 'a');
            while (true) {
                if ((sub & first) == first && maskMap.count(sub)) {
                    cnt += maskMap[sub];
                }
                if (sub == 0) break;
                sub = (sub - 1) & mask;
            }
            res.push_back(cnt);
        }
        return res;
    }
};

Github 同步地址:

https://github.com/grandyang/leetcode/issues/1178

参考资料：

https://leetcode.com/problems/number-of-valid-words-for-each-puzzle/

https://leetcode.com/problems/number-of-valid-words-for-each-puzzle/discuss/372385/Java-Bit-manipulation-%2B-Map-Solution-90ms

https://leetcode.com/problems/number-of-valid-words-for-each-puzzle/discuss/372145/Python-Bit-manipulation-detailed-explanation

LeetCode All in One 题目讲解汇总(持续更新中…)

转载请注明来源于 Grandyang 的博客 (grandyang.com)，欢迎对文章中的引用来源进行考证，欢迎指出任何有错误或不够清晰的表达。可以在下面评论区评论，也可以邮件至 grandyang@qq.com