Najdłuższy niepowtarzalny ciąg


33

Biorąc pod uwagę ciąg wejściowy, znajdź najdłuższy ciągły podciąg, który nie ma żadnego znaku dwa razy lub więcej. Jeśli istnieje wiele takich podciągów, możesz je wypisać. Jeśli chcesz, możesz założyć, że dane wejściowe mieszczą się w drukowanym zakresie ASCII.

Punktacja

Odpowiedzi zostaną najpierw uszeregowane według długości ich najdłuższego niepowtarzającego się podciągu, a następnie według ich całkowitej długości. Niższe wyniki będą lepsze dla obu kryteriów. W zależności od języka prawdopodobnie będzie to wyglądać jak wyzwanie dla z ograniczeniami źródłowymi.

Banalność

W niektórych językach osiągnięcie wyniku 1, x (lenguage) lub 2, x (Brain-flak i inne Turing Tepitits) jest dość łatwe, jednak istnieją inne języki, w których zminimalizowanie najdłuższego nie powtarzającego się podciągu jest wyzwaniem. Świetnie się bawiłem, zdobywając 2 punkty w Haskell, więc zachęcam do szukania języków, w których to zadanie jest fajne.

Przypadki testowe

"Good morning, Green orb!" -> "ing, Gre"
"fffffffffff" -> "f"
"oiiiiioiiii" -> "io", "oi"
"1234567890"  -> "1234567890"
"11122324455" -> "324"

Zgłoszenie punktacji

Możesz oceniać swoje programy za pomocą następującego fragmentu:


Proponowany przypadek testowy: 11122324455Jonathan Allan zdał sobie sprawę, że moja pierwsza wersja nie obsługiwała go poprawnie.
Dennis

@Dennis Dodano przypadek testowy. Jestem ciekawy, jak to się stało.
Wheat Wizard

2
Wygenerowałem wszystkie podciągi (już posortowane według długości), a następnie zduplikowałem podciągi i zachowałem te, które pozostały. Niestety zmienia to porządek; 11122pojawia się później 324, ale zostaje deduplikowany 12.
Dennis

Zastanawiam się, gdzie jest odpowiedź na spację.
Magic Octopus Urn

Odpowiedzi:


13

C, wynik 2,  747   720  662 bajtów

L  [  1  <<  7  ]  ,  *  q  ,  *  r  ,  l  ,  d  ,  i  ,  c  ,  j  ,  s  ,  t  ,  k  =  1  <<  7  ;  h  (  )  {  q  =  s  +  i  +  j  ++  ;  *  q  %  k  &&  !  L  [  *  q  %  k  ]  ++  &&  h  (  ++  c  )  ;  }  g  (  )  {  q  =  s  +  i  ;  *  q  %  k  ?  z  (  k  )  ,  h  (  j  =  c  =  0  )  ,  c  >  d  &&  (  d  =  c  )  &&  (  l  =  i  )  ,  g  (  ++  i  )  :  0  ;  }  f  (  S  ,  T  )  {  s  =  S  ;  l  =  i  =  d  =  0  ;  g  (  t  =  T  )  ;  p  (  i  =  0  )  ;  }  p  (  )  {  q  =  s  +  l  +  i  ;  r  =  t  +  i  ;  i  ++  <  d  ?  p  (  *  r  =  *  q  )  :  (  *  r  =  0  )  ;  }  z  (  i  )  {  L  [  --  i  ]  =  0  ;  i  &&  z  (  i  )  ;  }

Działa co najmniej na 32-bitowym MinGW (z wyłączonymi optymalizacjami). Nie używa jednego słowa kluczowego.

Działa najwyraźniej na TIO z gcc i clang: Wypróbuj online! (Dzięki @Dennis!)

Zadzwoń z:

int main()
{
    char str[1024];

    f("Good morning, Green orb!", str);
    puts(str);

    f("fffffffffff", str);
    puts(str);

    f("oiiiiioiiii", str);
    puts(str);

    f("1234567890", str);
    puts(str);

    f("L  [  1  <<  7  ]  ,  *  q  ,  *  r  ,  l  ,  d  ,  i  ,  c  ,  j  ,  s  ,  t  ,  k  =  1  <<  7  ;  h  (  )  {  q  =  s  +  i  +  j  ++  ;  *  q  %  k  &&  !  L  [  *  q  %  k  ]  ++  &&  h  (  ++  c  )  ;  }  g  (  )  {  q  =  s  +  i  ;  *  q  %  k  ?  z  (  k  )  ,  h  (  j  =  c  =  0  )  ,  c  >  d  &&  (  d  =  c  )  &&  (  l  =  i  )  ,  g  (  ++  i  )  :  0  ;  }  f  (  S  ,  T  )  {  s  =  S  ;  l  =  i  =  d  =  0  ;  g  (  t  =  T  )  ;  p  (  i  =  0  )  ;  }  p  (  )  {  q  =  s  +  l  +  i  ;  r  =  t  +  i  ;  i  ++  <  d  ?  p  (  *  r  =  *  q  )  :  (  *  r  =  0  )  ;  }  z  (  i  )  {  L  [  --  i  ]  =  0  ;  i  &&  z  (  i  )  ;  }");
    puts(str);
}

Wydajność:

Kod z nieco bardziej czytelnym formatowaniem:

L[1<<7],
*q, *r, l, d, i, c, j, s, t, k=1<<7;

h()
{
    q = s+i+j++;
    *q%k && !L[*q%k]++ && h(++c);
}

g()
{
    q = s+i;
    *q%k ? z(k), h(j=c=0), c>d && (d=c) && (l=i), g(++i) : 0;
}

f(S, T)
{
    s = S;
    l = i = d = 0;
    g(t=T);
    p(i=0);
}

p()
{
    q = s+l+i;
    r = t+i;
    i++<d ? p(*r=*q) : (*r=0);
}

z(i)
{
    L[--i] = 0;
    i && z(i);
}

Można to wykorzystać do wygenerowania odpowiedniego odstępu, aby przejść do formatowania z wynikiem 2: Wypróbuj online!


C, wynik 3, 309 bajtów

i
,
j
,
l
,
c
,
d
;
f
(
\
c\
\
h\
\
a\
\
r
*
s
)
{
\
f\
\
o\
\
r
\
(
i
=
l
=
d
=
0
;
s
[
i
]
;
c
>
d
&&
(
d
=
c
)
&&
(
l
=
i
)
,
++
i
)
\
f\
\
o\
\
r
(
\
c\
\
h\
\
a\
\
r

L
[
\
1\
\
2\
\
8
\
]
=
{
j
=
c
=
0
}
;
s
[
i
+
j
]
&&
!
L
[
s
[
i
+
j
++
]
]
++
;
++
c
)
;
\
w\
\
r\
\
i\
\
t\
\
e
(
1
,
s
+
l
,
d
)
;
}

Wypróbuj online!


10

Haskell , wynik 2, 492 ... 307 224 212 209 207 bajtów

((yy:yyy))??ss|ss==yy  =  ""  |  yy==yy=yy:yyy??ss
ss??sss=ss
ss""=""

ss((ff:fff))  =  ff  :  ss  fff??ff
ff""=""

ff((xxx:xx))  =  ss((xxx:xx))##ff  xx
xx##xxx  |  ((((xx>>xx))<))  $  xxx>>xx=xxx|xx==xx=xx

Wypróbuj online!

Grał w golfa dosłownie setki bajtów dzięki WW i Ørjanowi Johansenowi !

Wyjaśnienie

Funkcja (??)pobiera znak ci ciąg znaków si zwraca najdłuższy prefiks s, który nie zawiera c. Niegolfowany i niezoptymalizowany pod kątem wyniku:

c ?? (y:s)  
    | c==y = ""
    | True = y : c ?? s
c ?? s = s

Funkcja ssużywa (??)do znalezienia najdłuższego prefiksu unikatowych znaków danego łańcucha:

ss (x:r) = x : (x ?? ss r)
ss "" = ""

(##)jest funkcją, która pobiera dwa ciągi i zwraca dłuższą. Porównanie długości polega na powtarzaniu łańcucha xtak często, jak xdługo ( x>>y) i na ydługości ( y>>x) oraz sprawdzaniu, który z powstałych łańcuchów jest leksykograficznie większy.

x ## y
  | (x>>x) < (y>>x) = y
  | True = x

Na koniec ffpowtarza się nad łańcuchem wejściowym, generuje najdłuższy przedrostek z ss, rekurencyjnie określa najdłuższy, nie powtarzający się podłańcuch ogona łańcucha i zwraca dłuższy z dwóch za pomocą (##):

ff "" = ""
ff (x:r) = ss(x:r) ## ff r

4
224, mainly by fusing away the intermediate list.
Ørjan Johansen

2
I combined this answer with the one I posted in chat earlier to get 216.
Wheat Wizard

3
209 by reordering things.
Ørjan Johansen

3
With the bounty announcement I took another look and realized the @ trick actually costs 2 bytes over just making ? two chars: 207
Ørjan Johansen

5

Lua, score 3, 274 bytes

g='g'..'s'..'u'..'b'  _G  [  'l'..'o'..'a'..'d'  ](  g[g  ](  "s  =...f  o  r d = # s - 1 , 0 , - 1 d  o f  or r = 1 , # s - d d  o t = s :s  ub  (r  ,r  +d  )i  f n  ot t:  fi  nd  '(  .)  .*  %1  't  he  n p  ri  nt  (t  )r  et  ur  n en  d e  n  d e  nd  ","  ",""))(...)

Note: Lua 5.2 or Lua 5.3 is required

Usage:

$ lua lnrs.lua "Good morning, Green orb!"
ing, Gre
$ lua lnrs.lua "fffffffffff"
f
$ lua lnrs.lua "oiiiiioiiii"
oi
$ lua lnrs.lua "1234567890"
1234567890
$ lua lnrs.lua "11122324455"
324

Main idea: interleave everything with spaces, insert " " (two spaces) to split long identifiers

Ungolfed code:

g = "gsub"
_G["load"](
   g[g](      -- g[g] == string.gsub - a function for substitution of substrings
      "The source of actual program, but two-space sequences were inserted in some places", 
      "  ",   -- we are replacing all two-space substrings
      ""      -- with an empty string
   )
)(...)

Actual program (after removing all pairs of spaces):

s = ...
for d = #s - 1, 0, -1 do
   for r = 1, #s - d do
      t = s:sub(r, r+d)
      if not t:find"(.).*%1" then
         print(t)
         return
      end
   end
end

BTW, the JS snippet for calculating the score fails on my code.


4

Retina 0.8.2, 37 bytes, score 9

.
$&$'¶
(.)(?<=\1.+).*

O#$^`
$.&
1G`

Try it online! The direct translation of this answer to Retina 1 saves a byte by using N instead of O#. However, if you naïvely golf the Retina 1 answer down to 28 bytes, its score actually rises to 10! Explanation:

.
$&$'¶

Generate all suffixes of the input.

(.)(?<=\1.+).*

For each suffix, take the prefix up to the first duplicated character.

O#$^`
$.&

Sort the remaining strings in reverse order of length (i.e. longest first).

1G`

Take the longest.


4

Jelly, score 2, 14 bytes

Ẇµµff  Q  €  Ṫ

Thanks to @JonathanAllan for score -1, +7 bytes and for noticing a bug.

Try it online!

How it works

Ẇµµff  Q  €  Ṫ  Main link. Argument: s (string)

Ẇ               Window; yield all substrings of s, sorted by length.
 µ              Begin a new chain. Argument: A (array of substrings)
  µ             Begin a new chain. Argument: A (array of substrings)
   f            Filter A by presence in itself. Does nothing.
       Q  €     Unique each; deduplicate all strings in A.
    f           Filter A by presence in the array of deduplicated substrings,
                keeping only substrings composed of unique characters.
             Ṫ  Tail; take the last (longest) kept substring.

4

Clean, score 7 5, 276 bytes

@[ss:s]=rr(pp[][ss:s])((@s))
@s=s
ee x[rr:xx]|e x rr=True=ee x xx
ee x xx=f
f=e'e'' '
e::!  Char  !  Char  ->Bool
e  _ _=  code  {

eqC
}
pp p[r:rr]|ee r p=p=pp(a r p)rr
pp a _=a
a  x[ll:l]=[ll:a x  l]
a l ll=[l]
l[]rr=e'l''l'
l ff[]=f

l[r:rr][ss:ll]=l rr ll
rr x y|l x y=y=x

Try it online! Thanks to @Οurous for showing me that it is possible to call ABC machine code directly from within Clean. This allows to get rid of the previous bottle-neck import which set the minimal score to 7, but needs the keyword code which sets the minimal score to 5 for this approach.

An ungolfed and not score-optimized version of the above code can be found here: Try it online!


Previous version with score 7, 158 154 130 bytes

import  StdEnv  
@[xx:rr]=c(%[][xx:rr])(@rr)
@e=e
c s b|  length  s<  length  b=b=s
%s[xx:r]|  isMember xx s=s= %(s++[xx])r
%r _=r

Try it online!

With the import the score cannot go below 7. Without the import one would need to implement equality on strings or chars without any library functions which is probably not possible, as can be seen in the new version above.


1
You can indeed implement equality using inline ABC, which should reduce the score. I'll come back with a suggested modification later today if you're interested.
Οurous

Eg: char equality: tio.run/##S85JTcz7/…
Οurous

@Ourous A code block with raw ABC instructions, which can be used for primitive functions like integer addition, for linking with C, bypassing the type system... welcome down the rabbit hole! (from cloogle) certainly sounds inviting. I'll look into it tomorrow, thanks for the suggestion!
Laikoni

1
@Οurous Thanks again, with your char equality test the score is now at 5.
Laikoni

Incidentally, you don't need either -IL flag, since nothing is being imported.
Οurous

3

Python 3, score 4, 155 bytes

exec(('l=la''mbd''a f'',e=en''ume''rat''e:m''ax''([f[ j  :k]  for  j,i in e ( f)f''or  k,i in e ( f )if  len  ( { *''f[j'':k]''})==k-''j],''key''=le''n)'))

This defines a function l.

Thanks to @xnor for pointing out that strings of length 3 don't raise the score, saving 32 bytes.

Try it online!


The string can be in chunks of 3, right?
xnor

@xnor Changing the name of the function, indeed. Thanks!
Dennis

3

Brachylog, score 2, 19 bytes

s  ᶠ  l  ᵒ  ≠  ˢ  t

Try it online!

Just a boring old "space everything out" answer. At least I learnt that metapredicates can be spaced away from the predicates and still work (and the (parametric) subscripts and superscripts can't).

s ᶠ - find all substrings of the given string

l ᵒ - order them by their length (ascending by default)

≠ ˢ - select those that have all distinct elements

t - get the tail (last element) of that - the one with the biggest length


2

Pyth, 11 bytes, score 4

-4 score thanks to Dennis

e lD {I# .:

elD{I#.:Q      Full program, inputs "string" from stdin and outputs to stdout
e              The last element of the list generated by taking
      .:Q      All substrings of the input
     #         Filtered for
   {I          Being invariant over deduplicate i.e. being "non-repeating"
 lD            and sorted by length

Try it online!


2

Husk, score 2, 10 bytes

►IIËII≠IIQ

Try it online!

Explanation

The program is equivalent to this:

►Ë≠Q  Implicit input.
   Q  List of substrings.
►     Find one that maximizes:
 Ë    all ordered pairs
  ≠   are inequal.

The built-in Ë evaluates on all ordered pairs of its argument x, and returns length(x)+1 if every result is truthy, otherwise 0. When we maximize this, we find the longest string that has no repeated characters.

In the submission, I just insert the identity function I between each function, twice. Since is the same as Ë, I≠ is the same as and so on, this does not change the semantics. The only danger is that a higher order function could decide to use one of the Is as its argument, but luckily that leads to a type error in our program, so it doesn't happen.


2

Clojure, score 4

#(  let  [N  (fn  [[_ & r]] r) R  (fn  R [f v c]  (if  c (R f (f v (  nth  c 0))  ( N  c)) v)) C  (fn  C  (  [i]  (C (  seq  i) 0)) ( [i  n]  (if i (C ( N  i )  (  inc n)) n)))  J  (fn  [c  i]  (assoc c (C  c) i)) I  (fn  F [f i n R]  (if ( =  (C  R) n) R (F f (f  i) n ( J  R (f  i)))))] ( apply  str  (R ( fn  [a  b] ( if  (< (C  a)  (C  b)) b a )) "" (  for  [k  (I N % (C  % ) [])]  (R  ( fn [ t  c ] ( if ( or ( = t (  str t) ) ((  set t)c))(apply  str t) ( J  t c)))[]k)))))

Oh man this was painful! N implements next, R is reduce, C is count, J is conj (works only for vectors) and I is iterate. apply str is there twice because otherwise "aaaa" input wouldn't return a string but a vector [\a]. Luckily I got to use apply and assoc, I didn't know you could assoc one index beyond a vector's last element :o


I shaved off some space: Try it online!
Ørjan Johansen


1

Python 3, score 4, 317 bytes

exec(('%s'  *58  %(  's=','in','pu','t(',');','pr','in','t(','so','rt','ed','((s','[i',':j',']f','or',' j',' i','n ','ra','ng','e(','1,','le','n(','s)','+1',')f','or',' i',' i','n ','ra','ng','e(','j)','if',' l','en','(s','et','(s','[i',':j',']))','==l','en','(s','[i',':j',']))',',k','ey','=l','en',')[','-1','])')))

Try it online!

Unexeced code:

s=input();print(sorted((s[i:j]for j in range(1,len(s)+1)for i in range(j)if len(set(s[i:j]))==len(s[i:j])),key=len)[-1])

lambda a contains mbda which has score 5, and a function needs return which apparently can't be execed (so takes a score of at least 5 for eturn), so a full program was necessary. It's probably possible to golf down the unexeced code size quite a bit, but I can't see a quick clear improvement.


1

Alice, 40 bytes

/ii..nn$$@@BBww..DD~~FF..!!nn$$KK??oo@@

(Trailing newline)

Try it online!

The instruction pointer moves diagonally in ordinal mode, so only every other character is executed.

i.n$@Bw.D~F.!n$K?o@

i     take input
.n$@  terminate if empty
B     push all nonempty substrings, with the longest on the top of the stack
w     push return address (start main loop)
.     make copy of current substring
D     deduplicate characters
~     swap: this places the original above the deduplicated copy
F     Push the original string if it is a substring of the deduplicated copy
      (which can only happen if they're equal); otherwise push empty string
.!    place a copy on the tape
n$K   if the empty string was pushed, return to start of loop
o     output
@     terminate

1

Perl 6, score: 15 10 8, length: 46 55 62 bytes

{~m:ov/(.+)<!{$0.comb.repeated}>/.max(&chars)}

Test it

{~m:ov/(..*)<!{(($0)).comb.repeated}>{{}}/.max(&chars)}

Test it

{m:ov:i/(..*)<!{(($0)).comb.repeated}>{{}}/.max((&chars)).Str}

Test it

Expanded:

{    # bare block lambda with implicit parameter 「$_」

    m                          # match (implicitly against 「$_」)
    :overlap                   # in every single way possible
    :ignorecase                # add a 「:」 to break up substring
    /

      (..*)                    # match at least one character

      <!{
        (($0)).comb.repeated  # backtrack if there were repeats
      }>

      {{}}                    # anon hash in code block (no-op)
    /

    .max((&chars))            # get the longest

    .Str                      # coerce to a Str (from a Match object)
}

Score of 5 for 88 bytes. There might be a few places to golf bytes though
Jo King

1

Java 8, score 9 (384 B) 7 (401 B)

S -> { int s = 0 , e = 0 , l = 0 , x = 0 , y = 0 , b [ ] = new int [ 256 ] ; for ( ; x <S.  length  & y <S.  length  & l <S.  length  - x ; x ++ ) { b [S[x]] = 1 ; for ( y ++ ; y <S.  length  && b [S[y]] < 1 ; b [S[y ++]] = 1 ) ; if ( l < y - x ) { s = x ; e = y ; l = y - x ; } for ( ; y <S.  length  && x < y & S[x] != S[y  ];)b [S[x ++]] = 0 ; }  String g=""; for( ; s<e ; g+= S[s++]);  return  g;}
  • Initial version. Will go down from here. Score is 9 due to "ubstring ", so substring will be the first part to replace.
  • Score is now 7 due to " length", which I probably won't be able to reduce further.. I doubt it is possible to drop the four uses of length. If it is possible, " eturn" (6) might lower the score by 1 as final improvement, but I guess this is it (except maybe a small reduce in byte-count..)

Try it online.



0

Mathematica, score 11 9

Length@Last@Select[Subsequences[Characters@#],#==DeleteDuplicates  @#&]&

Shaving a couple of bytes off of the longest nonrepeating string by obscuring the function's name:

Length@Last@Select[Subsequences[Characters  @#],#==(  ToExpression@ 
StringJoin@@FromCharacterCode@{{68},{101},{108},{101},{116},{101},{68},{117},
{112},{108},{105},{99},{97},{116},{101},{115}}))@#&]&

0

Kotlin, score: 11 10 9 bytes, length: 227 246 245 bytes

indices
  .flatMap { p -> indices . map { p to p + it } }
  .  filter { (r,i) -> i < length  }
  .map { ( s , a )->substring  (  s,  a  ) }
  .  filter { it  .  groupBy  { it } .  none { ( k , v )->v . size>1 } }
  .maxBy { it.length }

The longest is ubstring, which is 9 chars

It is called like this:

val c = "Good morning, Green orb!"

fun String.c(): String? = indices
    .flatMap { p -> indices . map { p to p + it } }
    .  filter { (r,i) -> i < length  }
    .map { ( s , a )->substring  (  s,  a  ) }
    .  filter { it  .  groupBy  { it } .  none { ( k , v )->v . size>1 } }
    .maxBy { it.length }

fun main(args: Array<String>) {
    val text = """indices
    .flatMap { p -> indices . map { p to p + it } }
    .  filter { (r,i) -> i < length  }
    .map { ( s , a )->substring  (  s,  a  ) }
    .  filter { it  .  groupBy  { it } .  none { ( k , v )->v . size>1 } }
    .maxBy { it.length }"""
    val message = text.c()!!
    println(message)
    println(text.length)
    println(message.length)
    println(c.c())
}

Aren't you able to reduce it to 10 by adding an additional space between roupingBy and {?
Kevin Cruijssen

1
Nice find, I changed the other 11s and and got down to 10
jrtapsell

It is 10 chars, but the longest substring isn't roupingBy (which is 9 chars) but eachCount (with trailing space).
Erik the Outgolfer

roupingBy has a trailing space (visible in the markdown, but the renderer seems to strip it)
jrtapsell

Managed to reduce it to 9, fixing the trimming issue
jrtapsell


0

05AB1E, 22 bytes | Score: 2

Œ  ʒ  D  Ù  Q  }  é  ¤

-1 score + 7 bytes thanks to HeebyJeeby

Try it online!


05AB1E, 15 bytes | Score: 3

Œ ʒ D Ù Q } é ¤

Try it online!


05AB1E, 8 bytes | Score: 8

ŒʒDÙQ}é¤

Try it online!


05AB1E can actually do something rather cheap... adding whitespace into 05AB1E does nothing.

If there is a rule against this, I can also use ´ and like 7 other chars.


1
@HeebyJeebyMan because I'm a moron, got a problem with that?
Magic Octopus Urn

@HeebyJeebyMan kidding haha, thanks for the idea.
Magic Octopus Urn
Korzystając z naszej strony potwierdzasz, że przeczytałeś(-aś) i rozumiesz nasze zasady używania plików cookie i zasady ochrony prywatności.
Licensed under cc by-sa 3.0 with attribution required.