๐Ÿ“š Study/Baekjoon

[Silver IV] DNA - 1969

์œฐ๊ฐฑ 2024. 11. 27. 19:17

๋ฌธ์ œ

DNA๋ž€ ์–ด๋–ค ์œ ์ „๋ฌผ์งˆ์„ ๊ตฌ์„ฑํ•˜๋Š” ๋ถ„์ž์ด๋‹ค. ์ด DNA๋Š” ์„œ๋กœ ๋‹ค๋ฅธ 4๊ฐ€์ง€์˜ ๋‰ดํด๋ ˆ์˜คํ‹ฐ๋“œ๋กœ ์ด๋ฃจ์–ด์ ธ ์žˆ๋‹ค(Adenine, Thymine, Guanine, Cytosine). ์šฐ๋ฆฌ๋Š” ์–ด๋–ค DNA์˜ ๋ฌผ์งˆ์„ ํ‘œํ˜„ํ•  ๋•Œ, ์ด DNA๋ฅผ ์ด๋ฃจ๋Š” ๋‰ดํด๋ ˆ์˜คํ‹ฐ๋“œ์˜ ์ฒซ๊ธ€์ž๋ฅผ ๋”ฐ์„œ ํ‘œํ˜„ํ•œ๋‹ค. ๋งŒ์•ฝ์— Thymine-Adenine-Adenine-Cytosine-Thymine-Guanine-Cytosine-Cytosine-Guanine-Adenine-Thymine๋กœ ์ด๋ฃจ์–ด์ง„ DNA๊ฐ€ ์žˆ๋‹ค๊ณ  ํ•˜๋ฉด, “TAACTGCCGAT”๋กœ ํ‘œํ˜„ํ•  ์ˆ˜ ์žˆ๋‹ค. ๊ทธ๋ฆฌ๊ณ  Hamming Distance๋ž€ ๊ธธ์ด๊ฐ€ ๊ฐ™์€ ๋‘ DNA๊ฐ€ ์žˆ์„ ๋•Œ, ๊ฐ ์œ„์น˜์˜ ๋‰ดํด์˜คํ‹ฐ๋“œ ๋ฌธ์ž๊ฐ€ ๋‹ค๋ฅธ ๊ฒƒ์˜ ๊ฐœ์ˆ˜์ด๋‹ค. ๋งŒ์•ฝ์— “AGCAT"์™€ ”GGAAT"๋Š” ์ฒซ ๋ฒˆ์งธ ๊ธ€์ž์™€ ์„ธ ๋ฒˆ์งธ ๊ธ€์ž๊ฐ€ ๋‹ค๋ฅด๋ฏ€๋กœ Hamming Distance๋Š” 2์ด๋‹ค.

์šฐ๋ฆฌ๊ฐ€ ํ•  ์ผ์€ ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค. N๊ฐœ์˜ ๊ธธ์ด M์ธ DNA s1, s2, ..., sn๊ฐ€ ์ฃผ์–ด์ ธ ์žˆ์„ ๋•Œ Hamming Distance์˜ ํ•ฉ์ด ๊ฐ€์žฅ ์ž‘์€ DNA s๋ฅผ ๊ตฌํ•˜๋Š” ๊ฒƒ์ด๋‹ค. ์ฆ‰, s์™€ s1์˜ Hamming Distance + s์™€ s2์˜ Hamming Distance + s์™€ s3์˜ Hamming Distance ... ์˜ ํ•ฉ์ด ์ตœ์†Œ๊ฐ€ ๋œ๋‹ค๋Š” ์˜๋ฏธ์ด๋‹ค.

์ž…๋ ฅ

์ฒซ ์ค„์— DNA์˜ ์ˆ˜ N๊ณผ ๋ฌธ์ž์—ด์˜ ๊ธธ์ด M์ด ์ฃผ์–ด์ง„๋‹ค. ๊ทธ๋ฆฌ๊ณ  ๋‘˜์งธ ์ค„๋ถ€ํ„ฐ N+1๋ฒˆ์งธ ์ค„๊นŒ์ง€ N๊ฐœ์˜ DNA๊ฐ€ ์ฃผ์–ด์ง„๋‹ค. N์€ 1,000๋ณด๋‹ค ์ž‘๊ฑฐ๋‚˜ ๊ฐ™์€ ์ž์—ฐ์ˆ˜์ด๊ณ , M์€ 50๋ณด๋‹ค ์ž‘๊ฑฐ๋‚˜ ๊ฐ™์€ ์ž์—ฐ์ˆ˜์ด๋‹ค.

์ถœ๋ ฅ

์ฒซ์งธ ์ค„์— Hamming Distance์˜ ํ•ฉ์ด ๊ฐ€์žฅ ์ž‘์€ DNA ๋ฅผ ์ถœ๋ ฅํ•˜๊ณ , ๋‘˜์งธ ์ค„์—๋Š” ๊ทธ Hamming Distance์˜ ํ•ฉ์„ ์ถœ๋ ฅํ•˜์‹œ์˜ค. ๊ทธ๋Ÿฌํ•œ DNA๊ฐ€ ์—ฌ๋Ÿฌ ๊ฐœ ์žˆ์„ ๋•Œ์—๋Š” ์‚ฌ์ „์ˆœ์œผ๋กœ ๊ฐ€์žฅ ์•ž์„œ๋Š” ๊ฒƒ์„ ์ถœ๋ ฅํ•œ๋‹ค.

์˜ˆ์ œ ์ž…๋ ฅ 1 ๋ณต์‚ฌ

5 8
TATGATAC
TAAGCTAC
AAAGATCC
TGAGATAC
TAAGATGT

์˜ˆ์ œ ์ถœ๋ ฅ 1 ๋ณต์‚ฌ

TAAGATAC
7

์˜ˆ์ œ ์ž…๋ ฅ 2 ๋ณต์‚ฌ

4 10
ACGTACGTAC
CCGTACGTAG
GCGTACGTAT
TCGTACGTAA

์˜ˆ์ œ ์ถœ๋ ฅ 2 ๋ณต์‚ฌ

ACGTACGTAA
6

์˜ˆ์ œ ์ž…๋ ฅ 3 ๋ณต์‚ฌ

6 10
ATGTTACCAT
AAGTTACGAT
AACAAAGCAA
AAGTTACCTT
AAGTTACCAA
TACTTACCAA

์˜ˆ์ œ ์ถœ๋ ฅ 3 ๋ณต์‚ฌ

AAGTTACCAA
12

 


# ํ’€์ด ๋ฐฉ๋ฒ•

1. “Hamming Distance๊ฐ€ ์ตœ์†Œ”๋ผ๋Š” ์˜๋ฏธ ํŒŒ์•…ํ•˜๊ธฐ

from ๋ฌธ์ œ) ๋งŒ์•ฝ์— “AGCAT"์™€ ”GGAAT"๋Š” ์ฒซ ๋ฒˆ์งธ ๊ธ€์ž์™€ ์„ธ ๋ฒˆ์งธ ๊ธ€์ž๊ฐ€ ๋‹ค๋ฅด๋ฏ€๋กœ Hamming Distance๋Š” 2์ด๋‹ค.

์„œ๋กœ ๋‹ค๋ฅธ ๋‘ DNA๊ฐ€ ๋น„์Šทํ•œ ๊ตฌ์„ฑ์„ ์ด๋ฃฐ์ˆ˜๋ก Hamming Distance๊ฐ’์€ ์ž‘์•„์ง„๋‹ค๋Š” ๊ฒƒ์„ ์ดํ•ด

2.. “Hamming Distance๊ฐ€ ์ตœ์†Œ”๊ฐ€ ๋˜๊ธฐ ์œ„ํ•ด DNA s์˜ ๊ผด ์˜ˆ์ธกํ•˜๊ธฐ

์ž…๋ ฅ ๋ฐ›์€ N๊ฐœ์˜ DNA์™€ ๋น„์Šทํ•œ ๊ตฌ์„ฑ์„ ์ด๋ฃจ์–ด์•ผ ํ•˜๋ฏ€๋กœ

์—ด๋งˆ๋‹ค ๊ณตํ†ต์ ์œผ๋กœ ๋งŽ์ด ๋‚˜์˜จ ์• (A,T,G,C)๋กœ ๊ตฌ์„ฑ๋˜๋ฉด ๋˜์ง€ ์•Š์„๊นŒ?

3.  ์•Œ๊ณ ๋ฆฌ์ฆ˜ ๋– ์˜ฌ๋ฆฌ๊ธฐ

step1) N๊ฐœ์˜ DNA๋“ค์€ ๋ชจ๋‘ ์ž…๋ ฅ ๋ฐ›๊ณ 
step2) ์—ด๋งˆ๋‹ค ๊ฐ๊ฐ์˜ ๋‰ดํด๋ ˆ์˜คํ‹ฐ๋“œ๊ฐ€ ๋ช‡๋ฒˆ ๋‚˜์™”๋Š”์ง€ ์นด์šดํŠธ ํ•œ๋‹ค
step3) ๊ทธ ์ค‘ ๊ฐ€์žฅ ๋งŽ์ด ๋‚˜์˜จ ๋‰ดํด๋ ˆ์˜คํ‹ฐ๋“œ๋ฅผ ๊ณ ๋ฅธ๋‹ค.
step3) ๊ทธ๋ฆฌ๊ณ  ์ด๋“ค๋กœ ๊ตฌ์„ฑ๋œ ์ตœ์ข… DNA s๋ฅผ ์ƒ์„ฑํ•œ๋‹ค.
step4) Hamming Distance ๊ณ„์‚ฐํ•œ๋‹ค. (์ตœ์ข… DNAs ๊ตฌ์„ฑ์œผ๋กœ ์ฑ„ํƒ๋˜์ง€ ์•Š๋Š” ๊ฒฝ์šฐ์— ๋Œ€ํ•œ ํšŸ์ˆ˜)

 

 


# ์ฝ”๋“œ

n, m = map(int, input().split())
dna_list = [input().strip() for _ in range(n)]
result = ''
count = 0

for col in range(m):
  cnt = {'A': 0, 'C': 0, 'G': 0, 'T': 0}
  for row in range(n):
    cnt[dna_list[row][col]] += 1
  sorted_cnt = sorted(cnt.items(), key = lambda x: (-x[1],x[0]))
  result += sorted_cnt[0][0]
  count += n - sorted_cnt[0][1]

print(result)
print(count)